Skip to content
This repository was archived by the owner on Mar 5, 2024. It is now read-only.

Conversation

scottjab
Copy link
Contributor

@scottjab scottjab commented Nov 3, 2016

Sequins will start 501ing if zookeeper has a blip. This PR attempts to cache partition state between blips.

r? @colinmarc
cc? @charleseff

partitions.go Outdated
replication: replication,
local: make(map[int]bool),
remote: make(map[int][]string),
old: make(map[int][]string, 1024),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

old isn't really a descriptive name. maybe disappeared?

partitions.go Outdated
}
}

// Keep track of old peers in case zookeeper goes away.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment isn't particularly helpful. I think it'd be better to add documentation to FindPeers about the new functionality.

partitions.go Outdated
for _, node := range unDedupedPartition {
if !found[node] {
found[node] = true
p.old[partitionId] = append([]string{node}, p.old[partitionId]...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd probably be slightly more efficient to append rather than prepend, and then truncate as part of the dedupe (working backwards). Even better, though, would be to just have a separate dedupe method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this always adds the nodes to this list, when we just want to add them if they got removed above.

partitions.go Outdated

oldPeers := make([]string, 1024)
copy(oldPeers, p.old[partition])
// Append old peers to peer list, in case of Zookeeper issues.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I don't think this comment tells the reader much (if they don't already know what's up).

partitions.go Outdated
oldPeers := make([]string, 1024)
copy(oldPeers, p.old[partition])
// Append old peers to peer list, in case of Zookeeper issues.
peers = append(peers, oldPeers...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means the dead peers will get shuffled into the live peers by the caller of this code. I think it'd be better if FindPeers return shuffled live peers + dead peers at the end.

partitions.go Outdated
unDedupedPartition := append(newPartition, p.old[partitionId]...)
found := map[string]bool{}

// Shitty dedupe, iterate though the remote peers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Language!

partitions.go Outdated
}
}

func (p *partitions) deDupe(nodes []string) []string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just call this dedupe

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also there's no reason for it to be on p: it can just be func dedupe(s []string) []string

partitions.go Outdated

func (p *partitions) deDupe(nodes []string) []string {
found := map[string]bool{}
dedupedNodes := make([]string, len(nodes))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be make([]string, 0, len(nodes)) or you're going to have extra empty values at the end

partitions.go Outdated
}
}

for partitionId, partition := range remote {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this still isn't checking that the remote peer actually disappeared (ie, that it used to be in p.remote and now is not)

partitions.go Outdated
}

for partitionId, partition := range remote {
newPartition := make([]string, len(partition))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd call this peers I think

partitions.go Outdated
newPartition := make([]string, len(partition))
copy(newPartition, partition)

unDedupedPartition := append(newPartition, p.disappeared[partitionId]...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need the temp variable here.

partitions.go Outdated
return path.Join(p.zkPath, fmt.Sprintf("%05d@%s", partition, p.peers.address))
}

// getPeers returns the list of peers who have the given partition available.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to update the docstring. However, I still think it'd be cleaner to just return one list, and consider the peers "ordered" by priority. That gives us the freedom to actually add in priority later, if we want to do some light downweighting of peers from which we've seen errors.

@colinmarc
Copy link
Contributor

Speaking of the build failing, this would be great with some tests!

@scottjab-stripe scottjab-stripe force-pushed the scottjab/cache-old-peers-for-a-bit branch from c8be9fe to e18c85e Compare December 1, 2016 21:48
return p
}

// Dedupelicates elements in a slice of strings.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be of the form dedupe deduplicates...

}
}

// FindPeers returns the list of peers who have the given partition available.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring is still wrong, and I still think the signature would be better if it just returned a list of peers vaguely prioritized. That way we could trivially add some sort of greylisting for peers that exhibit errors a lot.

serve.go Outdated
}

func shuffle(vs []string) []string {
func shuffle(vs []string, disappeared []string) []string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will return an arbitrary number of deleted nodes in addition to the live ones, which will change the behavior of proxy subtly. I think we want to try just peers, where dead peers are only considered if there aren't enough live peers to fulfill the replication factor.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants