Insights from Arbol's Data Engineering Team

September 7, 2023
Arbol QA Series featuring Christian Lemp

Welcome to Arbol’s Expert Insights Series. This Q&A session, with our seasoned data engineers, takes a behind-the-scenes look at how Arbol handles data and is laying the groundwork to consistently innovate in the risk-management space.

Q: In the years since Arbol was founded in 2018, how has the data team’s operational framework changed?
As Arbol has matured beyond a small startup into a growth company, the Data team has needed to match that growth. Our modus operandi is clear: accelerate the growth of a competitive insurance and derivatives business, support a data-driven culture at Arbol, be model citizens of dClimate, and create a culture of accountability, diversity, and career opportunities on the Data team.

This vision laid the foundation for a number of ambitious projects, from re-architecting our climate data warehouse, to creating a Business Intelligence function, to establishing better-defined career tracks for the team. The engineers at Arbol are truly extraordinary, and I’m happy I got to join the team and work with them.
Q: How is Arbol's tech stack uniquely positioned to address climate risks?
Arbol uses a unique combination of traditional cloud and cutting-edge web3 technologies. This hybrid approach allows us to offer highly scalable and efficient data solutions for climate risk forecasting, specifically tailored for our Risk and Pricing teams. Additionally, our infrastructure is designed to be transparent and well-structured, making it ideal for blockchain applications such as dRe Lifecycle, which is managed by our Office of Innovation.

Moreover, we have open-sourced some of our internal tools, gridded-etl-tools and nettle, which transform climate data from a variety of non-standard formats into standardized climate datasets optimized for analysis and building applications.
Q: How does Arbol use machine learning and artificial intelligence to assess risk and price policies?
Arbol’s risk and pricing engine applies machine learning and AI algorithms to large amounts of climate data to identify key trends and hidden patterns. The Data team supports this work by ensuring our climate datasets are accurate, complete, and optimized for high-performance analytics. The trends these models discover give Arbol a perspective on future weather events, which is incorporated into pricing and portfolio risk management.
Q: Please describe some of the challenges that the team has faced developing Arbol's tech stack and how they were overcome.
One challenge the team recently faced was how to serve very large gridded datasets of climate activity - which can be up to 5 terabytes in size - efficiently over IPFS, one of our core data protocols. The team ended up combining several technologies and standards - Zarr and IPLD - into a solution that uses specialized data formats which integrate very well with the IPFS distributed storage protocol. The result was a 90% improvement in query performance in some cases. You can read more about this work in the blog post Introducing Zarrchitecture on dClimate, or you can check out our open source toolkit, which bundles this custom solution into an open package for climate data developers.
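To illustrate why a chunked format such as Zarr can help with queries against very large gridded datasets, here is a minimal sketch in plain Python. It is not Arbol's implementation and does not use the Zarr or IPFS libraries; the array shape, chunk shape, and the helper `chunks_for_query` are hypothetical, chosen only to show that a small query touches a small fraction of the stored chunks.

```python
from itertools import product
from math import prod


def chunks_for_query(chunk_shape, query):
    """Return grid indices of every chunk that overlaps a query.

    `query` holds one half-open (start, stop) index range per dimension.
    """
    per_dim = []
    for size, (start, stop) in zip(chunk_shape, query):
        if stop <= start:
            raise ValueError("each query range must be non-empty")
        # First and last chunk index touched along this dimension.
        per_dim.append(range(start // size, (stop - 1) // size + 1))
    return list(product(*per_dim))


# Hypothetical daily global grid with dimensions (time, lat, lon).
array_shape = (3650, 720, 1440)
chunk_shape = (365, 90, 180)

# Ten years of daily values for a single grid cell.
query = [(0, 3650), (400, 401), (1000, 1001)]

needed = chunks_for_query(chunk_shape, query)
# Ceiling division per dimension gives the total number of chunks.
total_chunks = prod(-(-n // c) for n, c in zip(array_shape, chunk_shape))

print(f"chunks read: {len(needed)} of {total_chunks}")
```

In a content-addressed store, each chunk can be fetched independently, so a reader only needs to retrieve the chunks returned above rather than the whole dataset.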
Q: How does Arbol's platform integrate with other systems used by clients to manage risk, such as weather monitoring or satellite data?
Arbol has partnered with the RiskStream Collaborative, a network of insurers collaborating on blockchain solutions for risk innovation. Arbol’s dRe Lifecycle dashboard application provides real-time parametric loss calculations to participants using smart contracts hosted on the Canopy network. The weather data my team maintains on IPFS is used in smart contracts to send updates based on real-world events. This use of IPFS allows external data to be shared and stored securely, with immutability guarantees.
Q: What is one complex data problem you’ve solved that you think our audience would benefit from learning about?
Managing a variety of climate data for different uses - research, pricing, applications - is a complex problem in itself. The sheer volume and variety of data is a challenge: we currently maintain over 70 terabytes of data, with some single datasets being more than one terabyte. The size of this data requires us to use specialized tools (and even build our own) to process and transform raw data into a format optimized for high-powered analytics. Additionally, we receive station data from multiple sources and have had to develop our own in-house schema to standardize these formats so that we can create custom, blended datasets for climate analysis that spans the entire earth.
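As a rough illustration of what standardizing station data into one schema can look like, here is a minimal Python sketch. Arbol's in-house schema is not described in this interview, so everything here is an assumption: the `StationObservation` fields, the two source layouts, and the adapter functions are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class StationObservation:
    """Hypothetical common schema: metric units and ISO dates."""
    station_id: str
    day: date
    temp_max_c: float
    precip_mm: float


def from_source_a(row: dict) -> StationObservation:
    # Hypothetical source A reports Fahrenheit, inches, and ISO dates.
    return StationObservation(
        station_id=row["id"],
        day=date.fromisoformat(row["date"]),
        temp_max_c=(row["tmax_f"] - 32.0) * 5.0 / 9.0,
        precip_mm=row["prcp_in"] * 25.4,
    )


def from_source_b(row: dict) -> StationObservation:
    # Hypothetical source B reports metric values as strings,
    # with day-first dates.
    return StationObservation(
        station_id=row["station"],
        day=datetime.strptime(row["timestamp"], "%d/%m/%Y").date(),
        temp_max_c=float(row["max_temp_c"]),
        precip_mm=float(row["rain_mm"]),
    )


obs_a = from_source_a(
    {"id": "A-001", "date": "2023-09-07", "tmax_f": 86.0, "prcp_in": 0.5}
)
obs_b = from_source_b(
    {"station": "B-17", "timestamp": "07/09/2023",
     "max_temp_c": "29.5", "rain_mm": "3.2"}
)

# Both records now share one shape and can be blended into one dataset.
blended = sorted([obs_a, obs_b], key=lambda o: (o.day, o.station_id))
```

Keeping one small adapter per source means a new provider only requires a new adapter function, while downstream analysis code sees a single record type.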
Q: What industry trends or technologies have recently caught the team’s attention?
We see big opportunities for blockchain technology as a tool to create efficiencies and trust within complex multi-party transactions. Looking beyond the consumer hype that has dominated the crypto space over the past few years, there has been an increase in the number of small, private blockchain networks developed by consortia of institutions that all rely on each other to complete transactions - like reinsurance, collateralized securities, or carbon markets.

Then, of course, the deployment of large language model (LLM) AI is a huge step forward for discovering and synthesizing information based on collective knowledge. In principle, this should create opportunities to build bespoke applications, with AI trained on specific domain knowledge, that provide custom search and discovery tooling.

And generally, any creative visualization of complex systems (like weather patterns) to illustrate how interactions of individual components give rise to macro scenarios is interesting for research and analysis. Ultimately these micro conditions give rise to the events that Arbol helps our customers and partners protect against.
Q: Looking ahead, what new technologies or capabilities is Arbol exploring to further improve its platform and better serve clients?
Currently, Arbol is working closely with the dClimate team to create developer tools capable of sophisticated climate analytics, and to make source climate data accessible over IPFS. As part of the process, we are reworking our internal codebase – which we’ve developed over the past three years – as well as releasing bundles of open source microservice toolkits.

Looking ahead, we’re focused on AI in a couple of different areas:
  1. Building extremely high-powered computing environments to analyze climate patterns across the entire planet for many variables, so that we can tune and simulate scenarios for our risk and pricing models.
  2. Using LLMs like ChatGPT and similar technologies to help identify exactly which data source is best suited for a particular climate risk scenario.

To keep abreast of all of the exciting initiatives underway at Arbol, we invite you to follow us on LinkedIn and X.
