“The most exciting part about PLINY is that it brings together two fundamental areas of computer science, programming languages and machine learning,” said CS research scientist Vijayaraghavan (Vijay) Murali. “Swarat Chaudhuri had been presenting papers on programming languages at some of the same conferences where I was presenting, but I had not considered a collaboration until a mutual friend suggested it.”
Murali was completing a Ph.D. in Singapore and looking for post-doctoral opportunities in the United States when a friend mentioned the work of Chaudhuri, an associate professor of CS at Rice University.
Murali was excited to learn of PLINY, a four-year, $11 million project funded by the U.S. Defense Advanced Research Projects Agency (DARPA) and awarded to Rice in 2014. The project aims to develop “autocomplete” and “autocorrect” for code, much like the software that completes search queries and corrects spelling in Web browsers and on smartphones. Murali joined Rice as a post-doc.
“Working on your Ph.D. is like going down a deep well. You pick a topic and go deeper and deeper for four or five years. You are an expert in that area. Graduating is coming up out of that well, and it’s refreshing to be exposed to another area,” Murali said.
He enjoys applying his knowledge of programming languages, acquired as a doctoral student, to problems in a new realm. “The wonderful thing about PLINY is that it brings together two fundamental areas in CS, and you can apply the techniques and solutions in one field to solve problems in the other.”
Because Murali had focused on formal methods for programming languages, he felt like a newcomer to machine learning.
Software engineering has permeated most industries, so building an adequate workforce of programmers is essential to the future success of both advanced and developing countries. A major hurdle in training new programmers is the discovery and application of best practices. PLINY could help overcome that hurdle by bringing an intelligent but untrained workforce up to speed in basic programming roles.
“If you want to make programming accessible to non-programmers,” Murali said, “you need to create quick avenues to success that foster a sense of learning and accomplishment while at the same time delivering solutions.”
By mining huge repositories of open-source software on sites like GitHub, PLINY can automatically learn how software is being written. “GitHub is a gold mine for us,” Murali said. “Our goal is to mine repositories like this to learn how software is being written in the real world and what the common practices are.”
PLINY could be adapted to learn how specific rules apply to particular domains, such as Android apps. “When new Android developers want to write an app, they can apply PLINY’s knowledge to easily follow the ‘unwritten rules of the real world’ and write better code the first time,” Murali said.
PLINY can also identify bugs as software is developed and repair errors in programs. Murali feels a sense of accomplishment because the system he developed to learn how software is being written has already been applied successfully. He ran his system on Android apps so that it would automatically learn how they were written, then applied that knowledge to newer Android apps that were available on GitHub and installed on his own phone. He found a bug.
“The first milestone moment came when I was able to visualize in the phone app the same error that my system had discovered in the app’s source code. I contacted the developers and they were appreciative. It turned out to be a user-interface bug. Since then we have also found other insidious bugs in real apps, ranging from Bluetooth usage to cryptographic encryption.”
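For readers curious about what “learning how software is written” and then spotting a deviation might look like in code, here is a deliberately tiny Python sketch. It is not PLINY's actual method, which the article does not describe; it simply counts which call tends to follow which in a toy corpus and flags transitions that the corpus rarely or never contains. The function names, the 0.1 threshold, and the call names (`open`, `read`, `write`, `close`) are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def learn_transitions(corpus):
    """Count how often each call is directly followed by each other call."""
    counts = defaultdict(Counter)
    for seq in corpus:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return counts


def flag_unusual(seq, counts, threshold=0.1):
    """Return (call, next_call, probability) for transitions rarer than threshold."""
    flagged = []
    for cur, nxt in zip(seq, seq[1:]):
        followers = counts.get(cur, Counter())
        total = sum(followers.values())
        if total == 0:
            continue  # call never seen in the corpus, so there is no evidence either way
        prob = followers[nxt] / total
        if prob < threshold:
            flagged.append((cur, nxt, prob))
    return flagged


# Toy "repository" of call sequences (hypothetical data, not real app code).
corpus = [
    ["open", "read", "close"],
    ["open", "read", "read", "close"],
    ["open", "write", "close"],
]
model = learn_transitions(corpus)

# A new sequence that closes a resource without ever using it.
print(flag_unusual(["open", "close"], model))  # → [('open', 'close', 0.0)]
```

A real system would need far richer program representations and statistical models than adjacent-call counts, but the two-step shape is the same as the one Murali describes: learn from existing code first, then check new code against what was learned.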
For more information on the PLINY project, a collaboration between Rice University, the University of Texas at Austin, the University of Wisconsin-Madison, and GrammaTech, see: http://pliny.rice.edu