The Last Analytics Company

The analytics market is crowded, with countless companies offering nearly identical services. What’s worse, the technical task of recording analytics has become easy: many technologies throughout the stack make collecting, storing, and analyzing granular data accessible to any software engineer. Somehow, though, a clear winner hasn’t emerged in the analytics space. The stage is set for the last analytics company. The promise of analytics tools is to help gather and analyze data recorded from products and services to yield invaluable business insights. Only with analytics can you answer questions like “Does this copy convert better?” or “Do users actually know how to use feature X?” The broad availability of analytics in the past five years has created fast-growing companies keenly in touch with their customers’ needs, and has helped large organizations keep pace in quickly evolving markets. ...

October 8, 2015 · 3 min · Stuart Axelbrooke

Python 3 on Spark - Return of the PYTHONHASHSEED

If you’re anything like me, you’ve been stuck using Python 2 for the last 10 years, and for 8 of them you’ve been trying to switch to 3. Since the release of Spark 1.4, PySpark has supported Python 3. Fantastic! But then appears the omen of doom, like Death cackling at your final moments:

Exception: Randomness of hash of string should be disabled via PYTHONHASHSEED

Try in vain you may to set PYTHONHASHSEED: export it on the master? Set it in spark-env.sh? Possibly pssh -h slaves 'export PYTHONHASHSEED=0'? ...
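For context on why the variable matters at all: Python 3 randomizes string hashes per interpreter process by default, so separate worker processes can disagree about the hash of the same key unless they all share one PYTHONHASHSEED. The sketch below demonstrates that behaviour with plain Python and no Spark. The helper name `string_hash_in_fresh_interpreter` is hypothetical and exists only for this illustration. It shows the underlying cause, not the fix described in the post.

```python
import os
import subprocess
import sys


def string_hash_in_fresh_interpreter(text, seed=None):
    """Return hash(text) as computed by a brand-new Python process.

    Hypothetical helper for illustration. If seed is None, any inherited
    PYTHONHASHSEED is removed so the child uses default hash randomization.
    """
    env = dict(os.environ)
    env.pop("PYTHONHASHSEED", None)
    if seed is not None:
        env["PYTHONHASHSEED"] = str(seed)
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
    )
    return int(out.decode().strip())


# With the seed pinned, every fresh process agrees on the hash.
pinned = {string_hash_in_fresh_interpreter("spark", seed=0) for _ in range(3)}

# Without it, each process picks its own random seed, so the values
# will almost certainly differ from run to run.
unpinned = {string_hash_in_fresh_interpreter("spark") for _ in range(3)}

print("distinct hashes with PYTHONHASHSEED=0:", len(pinned))
print("distinct hashes without a fixed seed:", len(unpinned))
```

Pinning the seed in one shell is not enough on a cluster, because each worker's Python process reads the variable from its own environment. That would explain why exporting it only on the master does not silence the exception.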

September 9, 2015 · 1 min · Stuart Axelbrooke

Pipelining - A Successful Data Processing Model

It’s finally time to implement that new personalization service, the one you’ve spent months pushing for. With it, your app will be serving up relevant, personalized content to every user. But the further you look into it, the more you furrow your brow: a lot of processing must happen to properly prepare user data, and the prospect of expanding it to a much larger population raises performance concerns. ...

March 11, 2015 · 4 min · Stuart Axelbrooke