Ilya Sterin is a software engineer with Nextrials, a clinical trials data management software company, and also consults for a variety of startups, specifically on scalable and distributed system architectures. Ilya is also a book and blog author and an avid user of and contributor to open source software. When not hacking on yet another software project, Ilya enjoys coaching his 10-year-old son’s ever-growing number of sports teams. Ilya is a DZone MVB, is not an employee of DZone, and has posted 5 posts at DZone.

Flurry – our 64-bit id generation service

09.06.2013

Flurry was inspired by Twitter Snowflake. We needed to generate unique, distributed 64-bit ids for use in our applications, which are backed by an RDBMS. There are numerous approaches to this. A simple (and in some cases my favorite) approach, if you only use these ids for storage within an RDBMS, is Instagram’s. They basically use a stored procedure within Postgres to generate ids composed of time, logical shard id, and auto-increment bit components. Postgres has pretty advanced facilities for writing stored procedures and triggers, making this job rather simple. We tried this approach, but because we use MySQL, which has poor stored procedure support and, in versions before 5.6, doesn’t seem to have any way to generate a millisecond timestamp, we quickly discarded that idea.
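
To make the id layout concrete, here is a minimal sketch of packing a millisecond timestamp, a logical shard id, and a sequence counter into a single 64-bit integer. The 41/13/10 bit split, the custom epoch, and the function names `pack_id` and `unpack_id` are assumptions chosen for illustration; this is neither Instagram’s stored procedure nor Flurry’s code.

```python
# Illustrative bit widths (assumed for this sketch): 41 + 13 + 10 = 64 bits.
TIME_BITS = 41
SHARD_BITS = 13
SEQ_BITS = 10

# Hypothetical custom epoch in milliseconds (2013-01-01T00:00:00Z).
CUSTOM_EPOCH_MS = 1356998400000


def pack_id(timestamp_ms: int, shard_id: int, sequence: int) -> int:
    """Combine the three components into one integer, time in the high bits."""
    elapsed = timestamp_ms - CUSTOM_EPOCH_MS
    if not 0 <= elapsed < (1 << TIME_BITS):
        raise ValueError("timestamp does not fit in TIME_BITS")
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id does not fit in SHARD_BITS")
    if not 0 <= sequence < (1 << SEQ_BITS):
        raise ValueError("sequence does not fit in SEQ_BITS")
    return (elapsed << (SHARD_BITS + SEQ_BITS)) | (shard_id << SEQ_BITS) | sequence


def unpack_id(value: int) -> tuple[int, int, int]:
    """Recover (timestamp_ms, shard_id, sequence) from a packed id."""
    sequence = value & ((1 << SEQ_BITS) - 1)
    shard_id = (value >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
    elapsed = value >> (SHARD_BITS + SEQ_BITS)
    return elapsed + CUSTOM_EPOCH_MS, shard_id, sequence
```

Because the timestamp occupies the most significant bits, ids generated later sort after ids generated earlier, regardless of shard or sequence value.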

Our next approach was to try Twitter Snowflake, and after a day of ripping our hair out for various reasons, we decided to write our own. Snowflake is overly complex for someone outside of Twitter to use. Besides not being polished or distributed in binary form, it suffers from a dependency nightmare. The current head depends on older versions of Scala and on various other dependencies that suffer from the same issues. Upgrading these dependencies isn’t easy. Snowflake also uses an overabundance of Twitter libraries that suffer from the same dependency issues, which made it pretty easy to decide to write our own.

This isn’t meant as a criticism of Twitter. We’ve used other Twitter open source projects and love them. Snowflake is open source, and although they are nice enough to release it, their priority is to support their internal infrastructure, so it seems changes are only made when they need them internally or when there are bugs. The last update was a year ago. No viable forks exist that fix the issues I outlined, and we didn’t want to fork it ourselves: we figured we could start from scratch and make it leaner, forgoing some functionality in order to get a clean code base that’s easy to use and extend. We also wanted to make it configurable, so you don’t have to change code and recompile in order to change the bit scheme or use a different strategy for naming worker hosts.
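
As a rough illustration of what a configurable bit scheme can look like, the sketch below keeps the bit widths and epoch in a small settings object and passes it to a generator, so changing the layout means changing data rather than code. The names `BitScheme` and `IdGenerator`, the default widths, and the injectable clock are all hypothetical; this is not Flurry’s actual API or configuration format.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class BitScheme:
    """Hypothetical layout settings; the defaults are assumed, not Flurry's."""
    time_bits: int = 41
    worker_bits: int = 10
    sequence_bits: int = 12
    epoch_ms: int = 0

    def __post_init__(self) -> None:
        # Keep the sign bit clear so ids fit a signed 64-bit column.
        if self.time_bits + self.worker_bits + self.sequence_bits > 63:
            raise ValueError("scheme must fit in 63 bits")


class IdGenerator:
    def __init__(self, scheme: BitScheme, worker_id: int,
                 clock: Callable[[], int] = lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << scheme.worker_bits):
            raise ValueError("worker_id does not fit in worker_bits")
        self._scheme = scheme
        self._worker_id = worker_id
        self._clock = clock  # returns the current time in milliseconds
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        s = self._scheme
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                # Same millisecond: bump the sequence, wrapping at its width.
                self._sequence = (self._sequence + 1) & ((1 << s.sequence_bits) - 1)
                if self._sequence == 0:
                    # Sequence exhausted; wait for the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            elapsed = now - s.epoch_ms
            if not 0 <= elapsed < (1 << s.time_bits):
                raise OverflowError("timestamp does not fit in time_bits")
            return ((elapsed << (s.worker_bits + s.sequence_bits))
                    | (self._worker_id << s.sequence_bits)
                    | self._sequence)
```

A caller would build the scheme once, for example `IdGenerator(BitScheme(worker_bits=8, sequence_bits=14), worker_id=1)`, and the worker id could come from whatever host naming strategy a deployment prefers. Passing the clock in as a function also makes the generator easy to test with fixed timestamps.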

Flurry was born, and after extensive internal testing, we’re confident enough in its stability and functionality to release it to the world. It performs on par with Snowflake and is very configurable. There are features we plan to add in the near future that aren’t in the current release, but we’re confident that it will benefit others with similar needs.

You can see the project source and documentation here.

Download the latest release v0.1.0-beta here.

Enjoy!

Published at DZone with permission of Ilya Sterin, author and DZone MVB.
