I did a thing this summer. I was a part of the Google Summer of Code'21 program, working with The Julia Language.
Prelude
This post is about my GSoC experience: my journey, the things I set out to do, the things I did, and a few of the things that I know I will do next. Okay, that’s about it, but I like starting things off with proper introductions.
The Journey
Into the Rabbit Hole - Hello, Julia
Julia is the language I spent the summer of 2021 working with. I absolutely loved it. 10/10, would recommend to everyone. Seriously, I do recommend it to everyone I know.
My first real introduction to Julia was thanks to the 18.S191 course. Of course, I had seen it lurking on the TIOBE index and casually heard it in passing, but it had never really caught my attention before. Not the way it would.
The course is great. The language is great. “Great” is also a great understatement for both.
I picked up Julia with virtually no effort. It’s that easy. I won’t pretend to be an expert in the language (because I’m not), but it really is incredibly easy to hit the ground running with Julia. Wink wink.
The Beginning - Hi, Rachel
Disclaimer - Rachel is NOT another programming language.
Rachel is my absolutely amazing mentor. Seriously, folks, I can’t tell you enough about how helpful, supportive, patient, and welcoming she’s been. She’s super nice and has taught me a ton of things already. I owe many thanks to her.
No, I’m not writing all this just so she’d pass me in my final evaluation. I’m writing all this because she’s a kickass human being and even more of a kickass mentor. Happy mentor, happy life.
My first step down the Julia rabbit hole was the official website.
I browsed and shuffled through all the sections, one by one, and finally got to “Projects”. Hmmmmmm. Interesting. *click*.
I was blown away by the projects. By how intense they seemed to be and how sophisticated and complex they sounded. I still am. Some of these projects were concepts I’d never heard about before.
I love a good challenge. I was getting all fired up at the humbling prospect of even beginning to comprehend these ideas.
I found one that excited me the most - DeepChem.jl (we’ve changed that name a couple of times now, but that’s a story for another day). I liked the general idea a lot. I wasn’t entirely lost either, thanks to my fundamentally science-based background.
Scrolling down, I found just one mentor listed. Dr. Rachel Kurchin. Bonus. For some weird reason, to the novice I was back then, one mentor felt a lot less intimidating than two.
Interacting with the Julia community over the past few months, I’ve developed a completely different outlook. Seriously, the Julia community is simply delightful.
Sooo, what did I do? I cold-mailed her, asking how I could get started and involved. I didn’t expect a reply. You can imagine my surprise when I got one. We had a chat, talked about the project itself, basic things in Julia, and how I could get started. And so it began.
P.S. - DeepChem.jl was rebranded as Chemellia.
A Serendipitous Summer Journey
It was a summer of firsts. Rachel’s first time as a GSoC mentor. My first as a GSoC student.
At some point, I realized that since I first read about this project on the JSoC section of the Julia website, I should probably apply as well …right? We discussed a few ideas that she’d like to see implemented. I picked one and started drafting a proposal. Most of this was research-based. Super cool.
I initially planned on submitting three different proposals to The Julia Language for Chemellia. But I decided against this and felt it’d be better if I submitted one very high-quality proposal instead. Rachel was insanely supportive.
Please feel free to reach out to me if you want to take a look at my proposal.
Imagine my surprise when I got selected. Absolute bliss.
What started with me browsing through a new language simply because I was interested snowballed into my summer adventure.
Serendipity.
The End?
My GSoC'21 journey may have come to a close. But this is by no means the end of my journey with Chemellia. Nothing to see here folks, keep reading.
The Project
What it was about
My project was titled Tight-Binding Atomic Graph Neural Networks.
A summarized, sophisticated, domain-specific, slightly esoteric description of the title would be -
Graph Convolutional Neural Networks that predict properties of crystals, inspired by the Tight-Binding model formalism, modeling the atomic orbital interactions using information from Slater-Koster tables as a priori knowledge.
The simpler, shorter version would be -
ML models that predict properties of materials using GCNNs, given a priori scientific information.
But by no means was the scope of my contributions restricted to a narrow view of the project alone. I was also very involved in the development of ChemistryFeaturization.jl and AtomicGraphNets.jl.
My proposal aimed to develop an improved, more physics-informed implementation of OGCNN in Julia and help design the overall framework which would serve as the backbone of the Chemellia ecosystem.
“‘Physics-informed’ how?”, you ask?
The model I outlined in my proposal would initialize weights using data from the Slater-Koster tables, ensuring that the features would, in essence, be in line with the principles of the tight-binding model formalism.
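To make that idea a little more concrete, here is a toy sketch in plain Python. It is not the Chemellia code and it does not use real Slater-Koster data: `PRIOR_TABLE`, `ORBITALS`, `prior_weights`, and `random_weights` are all names I made up for illustration, and the numbers are placeholders. It only contrasts the two starting points: weights looked up from a table of pairwise orbital interactions versus weights drawn at random.

```python
import random

# Hypothetical placeholder values, NOT real Slater-Koster parameters.
# Each key is an unordered pair of orbital types; each value stands in
# for a tabulated interaction strength.
PRIOR_TABLE = {
    ("s", "s"): -1.0,
    ("s", "p"): 0.5,
    ("p", "p"): 0.25,
}

ORBITALS = ["s", "p"]


def prior_weights(orbitals, table):
    """Build a symmetric weight matrix whose entries come from the prior table."""
    weights = []
    for a in orbitals:
        row = []
        for b in orbitals:
            # The table stores each unordered pair once, so try both orderings.
            value = table.get((a, b), table.get((b, a)))
            if value is None:
                raise KeyError(f"no prior for orbital pair ({a}, {b})")
            row.append(value)
        weights.append(row)
    return weights


def random_weights(n, seed=0):
    """The usual alternative: small random values with no physical meaning."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(n)]


informed = prior_weights(ORBITALS, PRIOR_TABLE)    # starts from the table
uninformed = random_weights(len(ORBITALS))         # starts from noise
```

Either matrix is only a starting point. The difference is that the first one already encodes something about how the orbitals interact.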
However, this wasn’t the only glorious purpose of my proposal. The primary goals also included creating a better framework overall: one that was, at the very least, flexible, scalable, and easy to use.
Okay, show me the code
I’m not going to explain every commit and PR I made. I’ve done that in the commit messages and conversations on PRs already.
You can find all the contributions I made during GSoC'21 towards ChemistryFeaturization.jl and AtomicGraphNets.jl here and here respectively.
I actively tracked down issues, reviewed PRs, and took part in discussions across both the repositories, too.
In short, I did plenty of things, from debugging CI failures to actively participating in the “big picture” design decisions for the framework itself. The bulk of my contributions involved cleaning up and restructuring the entire ChemistryFeaturization.jl package, adding tests and documentation in the process, and sketching and developing the concrete representation for the featurization required by the proposed model. This work was critical and not always easy.
I managed to complete a good deal of my proposal, plus a lot of extra work, but not all of it. Some critical parts are still a work in progress.
One of these is developing a concrete Featurization scheme that also models the traits appropriately, in line with the data in the SK tables. Configuring the layers of the original model to use this featurization scheme (which would essentially derive elements from AtomicGraphNets.jl too) is also yet to be completed.
Nonetheless, I plan to continue working on Chemellia and contributing a lot more in the future.
TL;DR
I like Julia. I was a GSoC'21 student under The Julia Language, contributing to Chemellia.
I like Chemellia. I plan on continuing to work on it with my mentor, Rachel, post GSoC'21 as well.
You should check out Chemellia too!