Attending my first international R conference [EARL 2018]
What is EARL
EARL stands for Enterprise Applications of the R Language. It is a cross-sector conference focused on the commercial use of the R programming language, dedicated to real-world usage of R and featuring some of the world’s leading practitioners. It is organised by Mango Solutions, a data science solutions company that primarily uses R for most of its offerings and open source contributions. You can check their GitHub repo here. The conference started in 2014, when it was called Effective Applications of the R Language. Since then, it has taken place every year in London, UK.
The conference - 2018
I was visiting London in the same week as the conference, so I got in touch with Liz Matthews, the head of community and education at Mango Solutions, and managed to squeeze in an entry just a day before the conference started. I’m glad I did. Having only been exposed to the conference content online, I wanted to experience it first hand, talk to the attendees and organisers, and meet some of the amazing people we follow on social media and, if possible, just have a conversation with them. The agenda looked pretty interesting and I was definitely looking forward to the talks. Here are some that I managed to attend, along with some of their highlights.
A fascinating talk. She spoke from experience and shared some amazing user stories from the likes of DunnHumby, Tesco and M&S. She also shared her views on data privacy, how GDPR affects decision making (in a good way) and the role of women in tech.
One of the best talks of the conference, it was all about reproducibility and data science. There were lots of practical examples and features of RMarkdown, and also a discussion of the recent war of words over Jupyter Notebooks and “The First Notebook War” by Yihui Xie. Garrett is an amazing speaker and I learnt a lot just by listening to him on stage.
Jobst Loffler, Bayer Business Services
This talk was about how the combination of R, SAS, Python and Stata is currently being used in the drug discovery use case, and how Bayer is working towards a validated R environment to make development a lot more structured, easier to validate and easier for external users to contribute to. He also talked about how heavily R is used in the pharma industry and the whole ecosystem of packages being developed.
Another great presentation. David showcased the Microsoft Custom Vision API to train and use a custom vision recognizer, and showed how easy it is to use it to quickly prototype a Shiny application. Check out the repo to learn more.
We got to know about one of the most interesting use cases of AutoML and how a small prototype led to the signing of a player by a major baseball club in the States. I knew that AutoML is amazing and that Kagglers all across the globe have started using it to reach the top of the leaderboards, but here I got to see its application in combination with LIME, which is yet another amazing tool for understanding model output in depth.
A 101 on LIME (Local Interpretable Model-agnostic Explanations) and how it is used across Aviva and its insurance products. LIME was originally built in Python and was later ported to R by the famous Thomas Lin Pedersen. I would definitely recommend going through the package vignettes to learn more.
The first TensorFlow talk, and it was all about how Andrie used it to classify support tickets, using natural language processing with recurrent layers, to inform recommendations on the Zendesk ticketing system. It was also about his experience of learning a new technology, moving faster and not getting stuck on perfectionism, so as to get to results and then iterate.
This was one of the most stats-heavy talks of the conference, but I must say that Tim is a great speaker and the presentation was smooth. He shared his thoughts on SCAM (Shape Constrained Additive Models), a relatively new technique that enables the user to quickly and painlessly generate a non-linear model from a data set, whilst also incorporating their intuition (or domain knowledge) on how each variable should behave on a ‘macro’ level. He also talked about model assessment techniques and the perils of overfitting.
After these, there were some lightning talks as well:
- Patrik Punco, NOZ Medien, Subscription Analytics with focus on Churn Pattern Recognition in German News Company
- Jasmine Pengelly, Freelance, Putting the ‘R’ in bar
- Robin Penfold, Willis Towers Watson, Using network analysis of colleague relationships to find interesting new investment managers
- Matthias Trampisch, Boehringer Ingelheim, Experience of using R in the productive environment of Boehringer Ingelheim
- Andreas Wittmann, MAN Truck & Bus AG, Visualising huge amounts of Fleet Data using Shiny and Leaflet
- Ansgar Wenzel, Qbiz UK, An analysis of UK MOT results – Why does my car always fail?
- George Cushen, Shop Direct, Creating an awesome documentation website for your product/service with RMarkdown and Academic
- Agnes Salanki, Hotels.com, The whole is greater! A domain-specific size calculator case study at Hotels.com
- Mike K Smith, Pfizer, Managing and deploying R packages across an organisation – Cat-herding 101
I must say that both the organisers and the speakers did a fantastic job and the last talk ended at 5 pm just as planned. Kudos to the team.
Summarising the experience:
- I feel it’s a challenge to summarise your talk in 15-20 minutes, give time to the audience, make it interactive and encourage a healthy discussion. At the conference, I got to see a wide spectrum of speakers; some did this exceptionally well, and everyone gave their best. This is an important skill for a speaker.
- Some talks were good from a theoretical perspective, but lacked real-world examples and scenarios that the author/speaker had faced. Including such examples is not always required, but whenever you do, there is an instant connection with the audience.
- GIFs that show code (an alternative to live coding) are way better than static code images. Garrett, from RStudio, used a couple of them in his presentation and everyone was glued to the screen, trying to understand the code and the logic.
- The real learning happens before and after the talk. Having one-to-one discussions with speakers is really important. Even if you understood everything, just appreciating the speaker after their talk goes a long way in building relationships.
- The R ecosystem is immense. I met not only data scientists, but people from a lot of different backgrounds and roles who are using R in some form or other. It gives you a sense of how open the community is and how easy it is for a newcomer to just enter and explore.
- The community is amazing. Yes, I know this is now a cliché, but once you start communicating, you realise how fantastic this bunch is. I was lucky to meet Mark Sellors from Mango Solutions, Sean Lopp from RStudio and several other amazing folks.
A quick note to the Indian folks who want to experience such conferences -
I know that we have a dearth of conferences like these in our country, and attending such international conferences is not easy if the trip is not funded by an employer or by other means. So I would just say: do your best, engage with and help the community through StackOverflow and discourse channels, work on and maintain open source packages, and keep documenting your work in a way that is reproducible and useful for others. The best way to attend such conferences is to present your own work, but your community contributions are of prime importance and they open doors to a lot of opportunities ahead.
The more you give, the more you get back.