Conversation

christinahedges
Contributor

@christinahedges christinahedges commented Mar 18, 2022

This PR refactors a lot of our API to make sure we have no in-place centroids. Here are some top changes:

  1. We now have a rough_mask, which is our first pass at the masking
  2. The source_mask is now made in a slightly more robust and clear way
  3. FFIMachine now loads images in a more memory efficient way
  4. We're no longer using the rtol to clip out faint parts of the PSF. We just use the atol, i.e. we keep pixels where the PSF has counts greater than some value.
  5. For FFIMachine, we don't remove pixels that are saturated or "bad", we just make sure that in all the masks (rough_mask, source_mask, uncontaminated_source_mask) these are False.
  6. I added a different background correction that uses splines to fit a smooth background model. This works well for TESS, but I still need to test it for Kepler FFIs.
  7. Makes FFIs work without removing pixels

This PR supersedes #48

To Do

  • Check this runs for Kepler FFIs!

@christinahedges
Contributor Author

christinahedges commented Mar 18, 2022

This explains what each of the three masks is:
image

@christinahedges
Contributor Author

There's a notebook here which shows how to use this PR for getting the TESS photometry out...

Contributor

@jorgemarpa jorgemarpa left a comment

In the near future, we need to merge the changes from #60, including the perturbation API. I think after that the only 'big' changes to machine.py will be the way the source_mask and centroids are computed.

Comment on lines +216 to +233
@property
def dx(self):
    """Delta RA, corrected for centroid shift"""
    if not sparse.issparse(self.dra):
        return self.dra - self.ra_centroid.value
    else:
        ra_offset = sparse.csr_matrix(
            (
                np.repeat(
                    self.ra_centroid.value,
                    self.dra.data.shape,
                ),
                (self.dra.nonzero()),
            ),
            shape=self.dra.shape,
            dtype=float,
        )
        return self.dra - ra_offset

This is the core change to avoid in-place operations. Does this add overhead when calling self.dx? I don't think it would matter much, though, except in the case of large sparse data.
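
A minimal sketch of the non-in-place pattern discussed in this comment, not the PR's code: the centroid offset is written onto the same sparsity pattern as the delta matrix and subtracted, so the stored delta is left untouched. The helper name `subtract_on_pattern` and the toy values are hypothetical. It sizes the offset array from `nonzero()` rather than from `.data`, which keeps the lengths consistent even if the matrix stores explicit zeros.

```python
import numpy as np
from scipy import sparse


def subtract_on_pattern(delta, offset):
    """Return delta - offset on the nonzero entries only; delta is not modified."""
    rows, cols = delta.nonzero()
    # Sparse matrix holding `offset` at every nonzero position of `delta`
    offset_matrix = sparse.csr_matrix(
        (np.full(rows.size, offset, dtype=float), (rows, cols)),
        shape=delta.shape,
    )
    return delta - offset_matrix


# Toy (nsources x npixels) delta-RA matrix and a scalar centroid offset
dra = sparse.csr_matrix(np.array([[0.0, 1.5, 0.0], [2.0, 0.0, -0.5]]))
dx = subtract_on_pattern(dra, 0.5)
```

Each call builds a new offset matrix, which is the overhead the question above refers to; caching the result would be one way to avoid recomputing it, at the cost of extra memory.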

Comment on lines +254 to +258
def _update_delta_arrays(self, frame_indices="mean"):
    if self.nsources * self.npixels < 1e7:
        self._update_delta_numpy_arrays()
    else:
        self._update_delta_sparse_arrays()

I think this should be renamed to just _create_delta_*_arrays() (or just _delta_*_arrays()), since there's no "update" happening there.
I also think the frame_indices="mean" argument is unnecessary now.

Comment on lines +315 to +316
if frame_indices == "mean":
    frame_indices = np.where(self.time_mask)[0]

This isn't doing anything in the function.

plot=False,
):
"""Find the pixel mask that identifies pixels with contributions from ANY NUMBER of Sources
def _get_source_mask(self, source_flux_limit=1):

This new version of _get_source_mask looks "simpler" than the original one (I mean, it has fewer tunable params), which I like.

Is iterating twice enough to converge on a solid source_mask?

A diagnostic plot for this one would be great, something to check that we are capturing the f, r dependency well.

# mask out non finite values and background pixels
k = (np.isfinite(wgts)) & (
self.uncontaminated_source_mask.multiply(self.flux[t]).data > 100
def _get_centroid(self, plot=False):

We need short documentation here.

Now, this method computes a single centroid value for all frames. Do we want to also have centroids in each frame?
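
To make the question concrete, here is a minimal sketch of the two options. It assumes a flux-weighted mean as the centroid estimator, which may not be what `_get_centroid` does in this PR; the function name `flux_weighted_centroid` and the toy arrays are hypothetical.

```python
import numpy as np


def flux_weighted_centroid(position, flux):
    """Flux-weighted mean of `position` along the last (pixel) axis of `flux`."""
    flux = np.asarray(flux, dtype=float)
    return (flux * position).sum(axis=-1) / flux.sum(axis=-1)


# Toy data: 3 pixels at RA offsets 0, 1, 2 observed in 2 frames
ra = np.array([0.0, 1.0, 2.0])
flux = np.array(
    [
        [1.0, 2.0, 1.0],
        [1.0, 1.0, 2.0],
    ]
)

# One centroid per frame, shape (ntimes,)
per_frame = flux_weighted_centroid(ra, flux)
# A single centroid for all frames, computed from the time-averaged flux
single = flux_weighted_centroid(ra, flux.mean(axis=0))
```

The same function covers both cases because it always reduces over the pixel axis; only the input (the full cube or its time average) changes.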

return

def _remove_background(self, mask=None):
def _remove_background(self, mask=None, pixel_knot_spacing=10):

We can get rid of the photutils dependency.
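
For context, a rough sketch of what a spline-based smooth background fit with a pixel knot spacing could look like. The PR's actual implementation is not shown in this thread, so this uses `scipy.interpolate.LSQBivariateSpline` as a stand-in; the function name `fit_spline_background`, the `background_mask` argument, and the toy image are all assumptions.

```python
import numpy as np
from scipy.interpolate import LSQBivariateSpline


def fit_spline_background(image, background_mask, pixel_knot_spacing=10):
    """Least-squares cubic spline fit to the pixels flagged in background_mask."""
    nrows, ncols = image.shape
    row, col = np.mgrid[:nrows, :ncols].astype(float)
    # Interior knots every `pixel_knot_spacing` pixels, excluding the image edges
    row_knots = np.arange(pixel_knot_spacing, nrows - 1, pixel_knot_spacing)
    col_knots = np.arange(pixel_knot_spacing, ncols - 1, pixel_knot_spacing)
    spline = LSQBivariateSpline(
        row[background_mask],
        col[background_mask],
        image[background_mask],
        row_knots,
        col_knots,
    )
    # Evaluate the smooth model at every pixel, including the masked ones
    return spline(row.ravel(), col.ravel(), grid=False).reshape(image.shape)


# Toy image: a tilted-plane background plus one bright "star" that is masked out
row, col = np.mgrid[:40, :40]
truth = 100.0 + 0.5 * row + 0.2 * col
image = truth.copy()
image[18:23, 18:23] += 1e4
background_mask = np.ones(image.shape, dtype=bool)
background_mask[18:23, 18:23] = False

model = fit_spline_background(image, background_mask)
```

A smaller knot spacing lets the model follow finer background structure, at the risk of absorbing flux from unmasked faint sources.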

Comment on lines +112 to +121
thumb = np.min(self.flux, axis=0).reshape(self.image_shape)
gthumb = np.hypot(*np.gradient(thumb))
mask = (
    ~sigma_clip(
        np.ma.masked_array(gthumb, gthumb > 500),
        sigma=3,
        cenfunc=lambda x, axis: 0,
    ).mask
).ravel()
self._remove_background(mask=mask)

I think this can happen after super.__init__ now.

Comment on lines +583 to +591
pixel_mask = self.non_sat_pixel_mask & self.non_bright_source_mask
self.rough_mask = self.rough_mask.multiply(pixel_mask).tocsr()
self.rough_mask.eliminate_zeros()
self.source_mask = self.source_mask.multiply(pixel_mask).tocsr()
self.source_mask.eliminate_zeros()
self.uncontaminated_source_mask = self.uncontaminated_source_mask.multiply(
    pixel_mask
).tocsr()
self.uncontaminated_source_mask.eliminate_zeros()

This is great. I never thought of including the sat/bright pixel mask in the machine masks; this way we can keep all original pixels and make nice image plots 😃
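
A small self-contained illustration of the pattern in the excerpt above, with toy values that are not from the PR: a sparse (sources x pixels) mask is multiplied by a per-pixel boolean flag, so bad pixels become False in the mask while the pixel axis keeps its full length.

```python
import numpy as np
from scipy import sparse

# Toy (nsources x npixels) mask: True where a source contributes to a pixel
source_mask = sparse.csr_matrix(
    np.array(
        [
            [True, True, False, False],
            [False, True, True, True],
        ]
    )
)
# Per-pixel flag: False marks a saturated or otherwise bad pixel
pixel_mask = np.array([True, False, True, True])

# The 1D flag broadcasts across sources; convert back to CSR and drop stored zeros
masked = source_mask.multiply(pixel_mask).tocsr()
masked.eliminate_zeros()
```

Because the mask keeps its shape, image-shaped arrays can still be reshaped and plotted directly, which is the benefit noted in the comment above.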



def _combine_A(A, poscorr=None, time=None):
def _combine_A(A, time, poscorr=None):

This one will disappear after merging with the perturbation API.

)


def _find_uncontaminated_pixels(mask):

I'm not convinced this one belongs here; it used to be a hidden method of machine.py.

christinahedges added a commit that referenced this pull request Dec 8, 2022
Extracted out of #54, this is just a slightly more robust version of this part of the code.
@jorgemarpa jorgemarpa changed the title Refactor API to remove in-place centroids, and update FFIMachine Refactor API to remove in-place centroids, and update FFIMachine [DO-NOT-MERGE] Dec 13, 2022
@jorgemarpa
Contributor

jorgemarpa commented Dec 13, 2022

I isolated the efficient FFI changes and opened a new PR, #71. We'll keep this PR open for future reference when including the in-place operations and new source mask methods.

@jorgemarpa jorgemarpa changed the title Refactor API to remove in-place centroids, and update FFIMachine [DO-NOT-MERGE] Refactor API to remove in-place centroids, and update FFIMachine [do-NOT-merge] Dec 13, 2022