Newsletter debug - add subscribers list to meta on publish #46068
base: trunk
Code Coverage Summary
Cannot generate coverage summary while tests are failing. 🤐 Please fix the tests, or re-run the Code coverage job if it was something being flaky.
    while ( true ) {
        $api_path = sprintf(
            '/sites/%d/subscribers/?page=%d&per_page=%d&filter=email_subscriber',
If we're fetching all subscribers, do we need that extra email_subscriber filter? If we do, do you think you could explain why that's needed in the docblock for the method?
Since we already have other functions used to fetch subscribers in the codebase, I think that if we add a new one we need to be extra clear what it does, and how it differs from the others.
Alternatively, maybe that should be a parameter that could be passed to the method, so that this method can become the one method we use for everything.
> If we're fetching all subscribers, do we need that extra email_subscriber filter? If we do, do you think you could explain why that's needed in the docblock for the method?
I think this limits it to only subscribers that would be sent emails, excluding people who are only subscribed via the reader with no emails. So we don't NEED to limit it here, but it didn't seem like the others would be very useful for our purposes.
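The parameter idea from this thread could be sketched roughly as follows. This is only an illustration, not the PR's code: the `demo_` function names are hypothetical, and `$request` is an assumed callable that takes an API path and returns the decoded list of subscribers for that page. Only the path format, the page size of 100, and the `email_subscriber` filter value come from the diff.

```php
<?php
/**
 * Build the subscribers path. Passing null for $filter drops the filter,
 * so one helper can serve callers that want every subscriber type.
 */
function demo_build_subscribers_path( int $site_id, int $page, int $per_page, ?string $filter = 'email_subscriber' ): string {
    $query = array(
        'page'     => $page,
        'per_page' => $per_page,
    );
    if ( null !== $filter ) {
        $query['filter'] = $filter;
    }
    return sprintf( '/sites/%d/subscribers/?%s', $site_id, http_build_query( $query ) );
}

/**
 * Walk the pages until an empty or short page comes back.
 * $request is a hypothetical callable: path in, array of subscribers out.
 */
function demo_fetch_all_subscribers( callable $request, int $site_id, ?string $filter = 'email_subscriber', int $per_page = 100 ): array {
    $all  = array();
    $page = 1;
    while ( true ) {
        $batch = $request( demo_build_subscribers_path( $site_id, $page, $per_page, $filter ) );
        if ( ! is_array( $batch ) || empty( $batch ) ) {
            break; // Error or no more data.
        }
        $all = array_merge( $all, $batch );
        if ( count( $batch ) < $per_page ) {
            break; // Short page: this was the last one.
        }
        ++$page;
    }
    return $all;
}
```

Keeping `email_subscriber` as the default preserves the behavior discussed here, while the docblock gives a natural place to explain why the filter exists.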
    $page     = 1;
    $per_page = 100; // Maximum per page to minimize requests.

    while ( true ) {
Since we're potentially making multiple calls to WordPress.com to get all that data, I think this should be saved in a transient, to avoid making those calls too frequently.
While I generally agree we should limit calls, how long do we think that transient should last?
If we limit running this to posts being published, and return early if the meta is already set, then I think any extra calls would only be triggered by a new, separate post being published? (Or is there another way I am not thinking of?) If the subscriber list changed in the time between two posts being published, that transient would also have us adding stale data to the meta, which might not be very helpful for the debugger. If we set the transient's lifetime much shorter to avoid that situation, then it's unlikely to have any effect, but it also seems completely fine/appropriate to add as a safety.
Thoughts?
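For reference while weighing the lifetime question, a short-lived transient wrapper might look like the sketch below. The function name, the transient key, and the 60-second default are all assumptions for illustration. `get_transient()` and `set_transient()` are the real WordPress functions; the in-memory stand-ins exist only so the sketch runs outside WordPress, and they ignore expiry.

```php
<?php
// Stand-in storage used only when WordPress is not loaded.
final class Demo_Transient_Store {
    public static $items = array();
}

if ( ! function_exists( 'get_transient' ) ) {
    function get_transient( $key ) {
        return Demo_Transient_Store::$items[ $key ] ?? false;
    }
    function set_transient( $key, $value, $expiration = 0 ) {
        Demo_Transient_Store::$items[ $key ] = $value; // Expiry is ignored in this stand-in.
        return true;
    }
}

/**
 * Return the cached subscriber data if present, otherwise fetch and cache it.
 * $fetch is a hypothetical callable that performs the remote request.
 */
function demo_get_subscribers_cached( int $site_id, callable $fetch, int $ttl = 60 ): array {
    $key    = 'demo_nl_subscribers_' . $site_id; // Hypothetical transient name.
    $cached = get_transient( $key );
    if ( is_array( $cached ) ) {
        return $cached; // Cache hit: no remote call.
    }
    $fresh = $fetch( $site_id );
    set_transient( $key, $fresh, $ttl );
    return $fresh;
}
```

With a lifetime this short, the cache mostly guards against several publish events firing close together, which matches the "safety" framing above rather than a real reduction in calls.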
    if ( isset( $subscriber['email_address'] ) && is_email( $subscriber['email_address'] ) ) {
        // Determine if subscriber has an active paid plan.
        $is_paid = false;
        if ( isset( $subscriber['plans'] ) && is_array( $subscriber['plans'] ) ) {
            foreach ( $subscriber['plans'] as $plan ) {
                if ( isset( $plan['status'] ) && 'active' === $plan['status'] ) {
                    $is_paid = true;
                    break;
                }
            }
        }

        $subscriber_emails[] = array(
            'email' => sanitize_email( $subscriber['email_address'] ),
I wonder if we could consolidate things a bit with $subscriber['email_address']. If we check is_email, I would assume that sanitize_email wouldn't run into issues later?
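One possible reading of that consolidation, as a sketch only: clean the address once, validate the cleaned value, and reuse the same variable for the stored entry. The function name is hypothetical, and PHP's `filter_var()` is used here as a stand-in for `sanitize_email()` and `is_email()` so the example runs outside WordPress; the two are not guaranteed to accept exactly the same addresses.

```php
<?php
/**
 * Turn one raw subscriber record into an array with 'email' and 'is_paid' keys,
 * or null when the record has no usable email address.
 */
function demo_subscriber_entry( array $subscriber ): ?array {
    if ( ! isset( $subscriber['email_address'] ) || ! is_string( $subscriber['email_address'] ) ) {
        return null;
    }

    // Stand-ins for sanitize_email() and is_email().
    $email = filter_var( $subscriber['email_address'], FILTER_SANITIZE_EMAIL );
    if ( false === filter_var( $email, FILTER_VALIDATE_EMAIL ) ) {
        return null;
    }

    // A subscriber counts as paid when any plan is marked active.
    $is_paid = false;
    $plans   = isset( $subscriber['plans'] ) && is_array( $subscriber['plans'] ) ? $subscriber['plans'] : array();
    foreach ( $plans as $plan ) {
        if ( is_array( $plan ) && isset( $plan['status'] ) && 'active' === $plan['status'] ) {
            $is_paid = true;
            break;
        }
    }

    return array(
        'email'   => $email,
        'is_paid' => $is_paid,
    );
}
```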
    'email_subscribers' => isset( $subscriber_data['email_subscribers'] ) ? (int) $subscriber_data['email_subscribers'] : 0,
    'paid_subscribers'  => isset( $subscriber_data['paid_subscribers'] ) ? (int) $subscriber_data['paid_subscribers'] : 0,
    'all_subscribers'   => isset( $subscriber_data['all_subscribers'] ) ? (int) $subscriber_data['all_subscribers'] : 0,
    'subscriber_list'   => isset( $subscriber_data['subscriber_list'] ) && is_array( $subscriber_data['subscriber_list'] ) ? $subscriber_data['subscriber_list'] : array(),
If we have a lot of subscribers, that will make for a huge amount of post meta, saved for each post. That seems like it would be problematic.
Yeah, this is a main point of concern. With most newsletters there should be no issue, and leaving this out of REST and API responses is helpful, but once we start getting into extremes of tens to hundreds of thousands of subscribers it starts becoming a bit more concerning.
Two possible options:
- We choose to support this for sites with up to X subscribers -- either not adding the subscriber list at all or noting it truncated/incomplete once it reaches the cutoff.
- Scrap the idea of adding this post meta. For debugging we could use the live subscribers list on the dashboard and filter out any subscribers that have joined after the publish date. That does add more limitations to the debugger idea though, as any subscribers that unsubscribed, unsubbed & resubbed, or changed their paid tier status since the original publish wouldn't be reflected correctly in the tool.
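The first option (a cutoff with a truncation note) could look something like the sketch below. The count and list keys mirror the diff; the function name, the `timestamp` and `list_truncated` key names, and the default cap of 1000 are assumptions made for illustration, since the thread leaves the value of X open.

```php
<?php
/**
 * Build the meta value, capping the stored list and flagging when it was cut.
 */
function demo_build_subscribers_meta( array $counts, array $subscriber_list, int $max_list_size = 1000 ): array {
    $truncated = count( $subscriber_list ) > $max_list_size;

    return array(
        'timestamp'         => time(), // Assumed key name for the publish-time stamp.
        'email_subscribers' => (int) ( $counts['email_subscribers'] ?? 0 ),
        'paid_subscribers'  => (int) ( $counts['paid_subscribers'] ?? 0 ),
        'all_subscribers'   => (int) ( $counts['all_subscribers'] ?? 0 ),
        'subscriber_list'   => $truncated ? array_slice( $subscriber_list, 0, $max_list_size ) : $subscriber_list,
        'list_truncated'    => $truncated, // Hypothetical flag so the debugger can show the list is incomplete.
    );
}
```

The counts stay accurate even when the list is cut, so the debugger can still report how many subscribers existed at publish time.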
    // First, get subscriber counts from stats endpoint.
    $stats_path = sprintf( '/sites/%d/subscribers/stats', $site_id );
We already have logic to fetch from that endpoint in
Lines 58 to 67 in b84f02d
    if ( Jetpack::is_connection_ready() ) {
        $site_id  = Jetpack_Options::get_option( 'id' );
        $api_path = sprintf( '/sites/%d/subscribers/stats', $site_id );
        $response = Client::wpcom_json_api_request_as_blog(
            $api_path,
            '2',
            array(),
            null,
            'wpcom'
        );
Maybe we can take the opportunity to consolidate things into one central method? That would be helpful I think, given that we already fetch subscribers in multiple places in the codebase.
    // First, get subscriber counts from stats endpoint.
    $stats_path     = sprintf( '/sites/%d/subscribers/stats', $site_id );
    $stats_response = Client::wpcom_json_api_request_as_blog(
Same as for the other API call, I think we should save that data in a transient to save outgoing calls when possible.
    add_filter( 'jetpack_published_post_flags', array( $this, 'set_post_flags' ), 10, 2 );

    add_action( 'jetpack_published_post', array( $this, 'store_subscribers_when_sent' ), 10, 3 );
@Automattic/jetpack-vulcan I have a question about this: do we need to sync the new _jetpack_newsletter_subscribers_when_sent post meta when we do this, to ensure we have access to that data on WordPress.com? (We will be using that post meta in a page on WordPress.com.)
Thank you!
Yes - currently with this PR checked out and after creating a new post with Newsletter enabled, _jetpack_newsletter_subscribers_when_sent exists as a meta key for the post on the remote site, but not on the cache site.
If the post meta should be available on WordPress.com, it would need to be whitelisted (in the Sync package and then in WPcom in the jetpack mu-plugin, in $post_meta_whitelist within sync/class.jetpack-sync-defaults.php).
Separately it looks like a Sync test needs updating for this PR - test_sends_publish_post_action. The failing assertion is due to another event being queued after the publish hook, so the most recent event isn’t jetpack_published_post anymore (but updated_option). Instead, it looks like we should assert that an event with action jetpack_published_post exists for the post ID, so we're testing for intent but not ordering.
@Addison-Stavlo I'd be happy to work on that in a separate PR so it passes for changes made here if it would be helpful, since it's a change that looks like it should happen anyway.
Edit to add - I've started on a PR for the Sync test here: #46105
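The "intent, not ordering" assertion described above can be illustrated with a small helper. This is a sketch under assumptions, not the change in #46105: the helper name and the event shape (arrays with `action` and `post_id` keys) are hypothetical and do not reflect how the Sync test suite actually stores events.

```php
<?php
/**
 * Return true when any recorded event matches the action and post ID,
 * regardless of where it sits in the queue.
 */
function demo_has_event( array $events, string $action, int $post_id ): bool {
    foreach ( $events as $event ) {
        if ( ( $event['action'] ?? null ) === $action && ( $event['post_id'] ?? null ) === $post_id ) {
            return true;
        }
    }
    return false;
}

// A queue where another event (updated_option) was recorded after the publish event.
$events = array(
    array( 'action' => 'jetpack_published_post', 'post_id' => 7 ),
    array( 'action' => 'updated_option' ),
);

// Checking only the most recent event would fail here; checking for existence passes.
$most_recent_is_publish = 'jetpack_published_post' === $events[ count( $events ) - 1 ]['action'];
$publish_event_exists   = demo_has_event( $events, 'jetpack_published_post', 7 );
```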
Fixes # https://linear.app/a8c/issue/NL-95/newsletters-add-post-meta-for-subscribers-list-at-time-of-first
Proposed changes:
- Adds _jetpack_newsletter_subscribers_when_sent meta when a post is published, populated with information about the subscriber list at the time: timestamp, numbers for different types of subscribers (email, paid, all), and a subscribers list of the email subscribers with fields 'email' and 'is_paid'.
Other information:
Jetpack product discussion
Does this pull request change what data or activity we track or use?
Testing instructions:
_jetpack_newsletter_subscribers_when_sent for the post that was published.