<p>In the CheRMiT notebook, students were guided through the initial setup of the cheminformatics segment of our project, which uses cheminformatics software to verify whether a set of reactants and products extracted from a sentence could actually react as described. Students were introduced to RDKit, a cheminformatics library new to the project, adopted to move away from the older Java-based libraries that the team no longer wished to use. They were then asked to program their own reaction validator, which applies a reaction operator: an abstract chemical transformation that takes in a reactant, applies the modifications that represent the reaction taking place, and outputs the modified chemical, that is, the product of the reaction.</p>
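<p>As a rough illustration of the idea, a reaction validator of this kind might be sketched with RDKit as follows. This is a minimal sketch and not the notebook's actual code: the function names <code>apply_operator</code> and <code>validates</code>, the single-reactant restriction, and the alcohol oxidation operator are all assumptions chosen for the example.</p>

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Hypothetical example operator (not from the project's operator list):
# oxidise an alcohol C-OH to a carbonyl C=O.
ALCOHOL_OXIDATION = "[C:1][OH:2]>>[C:1]=[O:2]"


def apply_operator(operator_smarts, reactant_smiles):
    """Return the canonical SMILES of every product the operator can produce.

    Assumes the operator has exactly one reactant template.
    """
    rxn = AllChem.ReactionFromSmarts(operator_smarts)
    reactant = Chem.MolFromSmiles(reactant_smiles)
    if reactant is None:  # the SMILES string could not be parsed
        return set()

    products = set()
    # RunReactants returns one tuple of product molecules per operator match.
    for product_tuple in rxn.RunReactants((reactant,)):
        for mol in product_tuple:
            try:
                Chem.SanitizeMol(mol)  # recompute valences and implicit Hs
            except Exception:
                continue  # skip chemically invalid products
            products.add(Chem.MolToSmiles(mol))
    return products


def validates(operator_smarts, reactant_smiles, product_smiles):
    """True if applying the operator to the reactant yields the stated product."""
    expected = Chem.MolFromSmiles(product_smiles)
    if expected is None:
        return False
    # Compare canonical SMILES so that equivalent spellings of a molecule match.
    return Chem.MolToSmiles(expected) in apply_operator(operator_smarts, reactant_smiles)


if __name__ == "__main__":
    # Ethanol -> acetaldehyde under the example operator.
    print(validates(ALCOHOL_OXIDATION, "CCO", "CC=O"))
```

<p>Comparing canonical SMILES is only one possible notion of a match; a fuller validator would likely also need to handle multi-reactant operators, stereochemistry, and tautomers.</p>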
<p>After outlining their code and programming an initial approach, students tested their implementations by passing in a candidate set of reactant-product pairs together with a canonical list of reaction operators. They then refined their implementations to improve recall, and identified missed reactions and limitations of the chemical processing functionality that would be useful to address in downstream, more mature software development for the project. This holistic validation, combined with the variety of student approaches, produced a wealth of analyses that accelerated the development of the project's cheminformatics validation software this semester: a more advanced infrastructure has since been rewritten and fleshed out, and is now being thoroughly benchmarked using language model outputs as input.</p>
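<p>The testing loop described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the project's benchmark code: <code>evaluate_recall</code> is a hypothetical helper, the validator is assumed to be any callable taking an operator, a reactant, and a product, and the lookup-table validator below is a toy stand-in so the example runs without a cheminformatics library.</p>

```python
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, str]                       # (reactant SMILES, product SMILES)
Validator = Callable[[str, str, str], bool]  # (operator, reactant, product) -> bool


def evaluate_recall(
    validator: Validator,
    operators: Iterable[str],
    known_pairs: Iterable[Pair],
) -> Tuple[float, List[Pair]]:
    """Return (recall, missed pairs) over pairs known to be real reactions.

    A pair counts as recovered if at least one operator validates it.
    """
    ops = list(operators)
    pairs = list(known_pairs)
    missed = [
        (reactant, product)
        for reactant, product in pairs
        if not any(validator(op, reactant, product) for op in ops)
    ]
    recall = (len(pairs) - len(missed)) / len(pairs) if pairs else 0.0
    return recall, missed


# Toy stand-in validator: a lookup table of (operator, reactant, product)
# triples it accepts. A real validator would apply the operator chemically.
ACCEPTED = {
    ("oxidation", "CCO", "CC=O"),
    ("oxidation", "CC(C)O", "CC(C)=O"),
    ("reduction", "CC=O", "CCO"),
}


def toy_validator(operator: str, reactant: str, product: str) -> bool:
    return (operator, reactant, product) in ACCEPTED


if __name__ == "__main__":
    pairs = [("CCO", "CC=O"), ("CC(C)O", "CC(C)=O"), ("CC=O", "CCO"), ("CCO", "CCN")]
    recall, missed = evaluate_recall(toy_validator, ["oxidation", "reduction"], pairs)
    print(recall, missed)
```

<p>Returning the missed pairs alongside the recall figure mirrors the workflow in the text: the misses are what point to gaps in the operator list or in the chemical processing itself.</p>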