TDS Editors

Towards Data Science

February might be the shortest month, but it certainly didn’t feel this way here at TDS, where our authors have been on top of their game, sharing strong contributions on timely topics — including some of the longest and most-read articles of the year so far.

Now that most of us have settled into the flow of things in 2024, we see our readers focus slightly less on career moves and more on core skills and concrete solutions to common issues. Our most-read and -discussed articles of the past month reflect that, and below you’ll find a representative sample of our February standouts.

Monthly Highlights

  • The Math Behind the Adam Optimizer
    In a clear, accessible, and widely shared explainer, Cristian Leo unpacks the mathematical inner workings of the Adam (Adaptive Moment Estimation) optimizer and, along the way, helps us understand why it’s become such a popular choice among deep learning practitioners.
  • 12 RAG Pain Points and Proposed Solutions
    While retrieval-augmented generation continues to make waves as a powerful option for boosting LLMs’ performance, its shortcomings are becoming clearer, too. Wenqi Glantz offers a useful resource for anyone who’s felt stuck implementing a RAG system recently, compiling 12 common pitfalls as well as suggested workarounds.
  • Data Visualization 101: Playbook for Attention-Grabbing Visuals
    For anyone looking to create “clearer, sharper and smarter visuals”—and who isn’t, really?—the latest data-visualization guide by Mariya Mansurova is essential reading, as it leverages numerous concrete examples (in Plotly) to showcase essential design principles in action.
Better Visualizations, Advanced ETL Techniques, RAG Pain Points, and Other February Must-Reads | by TDS Editors | Feb, 2024
Photo by Kelly Sikkema on Unsplash
  • Advanced ETL Techniques for Beginners
    If you’re an early-stage data engineer who’d like to give your data-ingestion skills a boost, 💡Mike Shakhomirov’s new tutorial is one you should definitely explore (and bookmark): it covers typical ingestion patterns and provides code snippets you can use to start tinkering on your own.
  • Advanced Retrieval-Augmented Generation: From Theory to LlamaIndex Implementation
    Interested in diving further into the exciting world of RAG? Leonie Monigatti explains the nitty-gritty details of pre-retrieval, retrieval, and post-retrieval optimizations, before walking us through the process of transforming a “naive” RAG pipeline into an advanced one.
  • Top Evaluation Metrics for RAG Failures
    We turn to RAG one final time this week, this time for Amber Roberts’s most recent contribution: a handy resource on troubleshooting unexpected or underwhelming performance, and on applying robust response and retrieval evaluation metrics to ensure all the pieces in your pipeline are working in harmony.
  • Building a Data Platform in 2024
    We were thrilled to welcome back Dave Melillo, who returns to this topic three years after first tackling it. His new post reevaluates the key components of efficient data platforms and shares valuable insights from his experience navigating the data challenges of various industries and working with both “large corporations and nimble startups.”

An Extra Dose of Python

Some of our most popular posts in the past few weeks covered the always-timely topic of Python programming for data and ML professionals. In case you missed them:

Our latest cohort of new authors

Every month, we’re thrilled to see a fresh group of authors join TDS, each sharing their own unique voice, knowledge, and experience with our community. If you’re looking for new writers to explore and follow, just browse the work of our latest additions, including Sarthak Handa, Vadim Arzamasov, Mahyar Aboutalebi, Ph.D. 🎓, James W, Mohammed Mohammed, Kirsten Jiayi Pan, Matthew Chak, Ugur Yildirim, Mikayil Ahadli, Hamza Gharbi, Sami Abboud, Matthew Gunton, Eivind Kjosbakken, Eva Revear, Nithhyaa Ramamoorthy, Rami Krispin, Kennedy Selvadurai, PhD, Vassily Morozov, Patrick Beukema, Thomas Rouch, Ritanshi Agarwal, Rohan Nanda, Nikolaus Correll, Mert Ersoz, Dani Lisle, Roberta Rocca, Adil Rizvi, Matthew Turk, Celia Banks, Ph.D., Skylar Jean Callis, Ryan McDermott, Anand Subramanian, Aayush Agarwal, P.G. Baumstarck, Jose D. Hernandez-Betancur, Khin Yadanar Lin, and Daniel Kang, among others.
