Abstract:
In this work, we announce a comprehensive, well-curated, open-source dataset with millions of samples covering pre-college and college-level problems in mathematics and science. We present preliminary results obtained with transformer architectures using character-to-character encoding. The dataset exposes several challenging problems and invites further research on architecture search.