This post is mainly a cathartic rant about my career frustrations. If this post was just a litany of complaints, then I’d keep it to myself. However, it also includes some constructive brainstorming about the various plans that I’ve considered over the last several months to resolve some of my career problems. While I’ve concluded that most of these plans are too unlikely to succeed to take seriously, the brainstorming process has refined my outlook on a lot of issues, and I felt that this might be worth sharing here on my blog. Ultimately, I’ve decided on some specific short-term and long-term plans from this set of possibilities, although I’m not quite 100% committed to them just yet.

First, let me articulate my vision for what my ideal research career would look like. While I might want complete freedom and autonomy to pursue basic research on whatever topic I want, whenever I want, for as long as I want, I don’t think that is a feasible target to desire or pursue. Except for independently wealthy scientists (which doesn’t include me), science always has a patron, whether it be public (a government) or private (a company), and I certainly believe that scientists should be advancing the causes of their patrons with applied research alongside their pursuit of more basic research. The more useful science is able to be, the more it can justify increased levels of investment and increased tolerance for risk. I would be quite content with a research portfolio that is 50% basic and 50% applied, especially if there were natural, synergistic connections between basic work that I chose to pursue and applied work that was a bit more of an obligation. I would also like to be part of a research community that I felt a sense of belonging to, with frequent substantive technical interactions that occasionally produced collaborative projects. Finally, I’d like some sense of long-term career stability so that I’m not constantly worried about how to keep my research career alive for another few years.

While I continue to have some semblance of a research career, it is far from my ideal. At my present job, I have a steady 20% of my time to pursue basic research interests. This is substantially better than my last job (Sandia), where I had no baseline support for my research interests, and my average level of support over 7 years was around 10%, although there were several years when it was just zero. I have failed to achieve research synergy in all of my jobs so far. Institutionally, Sandia was never even remotely interested in any of my research, and I was mainly there as highly skilled labor for other people’s research projects. At my present job, there is some correlation between the software development projects that people are assigned to and their underlying skills and interests, but we intentionally keep our personal research disconnected from our community-centric software development to avoid the appearance of bias. As for community, I have never felt like I’ve belonged anywhere, and my sense of isolation only continues to grow with time. I have never had a sustained, recurrent, and reciprocal research interaction with anyone, and I probably never will at this point. There is almost no overlap between the problems that I find technically interesting and what other scientists find technically interesting. Finally, I’ve had some career stability on paper, but not in practice. My job at Sandia was as permanent as I wanted it to be, but I was displaced from my intended field of research after only a few years, and the pressure to move yet again over to classified projects eventually started to ratchet up a few years later. Diversifying my research was something I could tolerate, but a shift to classified work is very often an irreversible career choice, and I began to feel very trapped. While I am happier at my present job, the institute that I work for is temporary - it has to win a 1-time grant renewal in a few more years to continue operations, and it has to become financially self-sustaining in 5 more years after that or else it will cease to be.

I certainly accept some of the blame for my career misfortunes. I do not publish papers frequently enough, I have not focused my research interests enough, and I have not engaged in enough self-promotion. Two friends of mine from graduate school who were in the same group and started at the same time as me have Google Scholar h-indices of 29 and 43, whereas my h-index is 12. They each have a well-defined primary research topic, whereas I’m just all over the place with my research projects. They are now tenure-track professors, and I failed to get any in-person job interviews after applying to several hundred tenure-track positions over the course of 5 years. Part of this stems from my early misunderstandings of interdisciplinarity. Interdisciplinary research is often touted as a great virtue in science, but the virtue is specifically in researchers collaborating across disciplinary boundaries and very much not in having researchers wander between disciplines without regard to those boundaries. Over the years, I have enjoyed pursuing research across a relatively wide variety of topics, but my “interdisciplinary” research interests have brought me to a point where I don’t feel welcome anywhere.

However, I am not willing to accept all of the blame for my career misfortunes. I also think there are systemic problems, both broadly throughout scientific research and narrowly related to some of my interests. The broader problems are relatively well known. There is an excellent book, How Economics Shapes Science, that presents a lot of data and analysis on how different funding levels and strategies affect the culture of scientific research and the distribution of career outcomes. I highly recommend the book to anyone interested in this topic, but I will not attempt to summarize it in any meaningful way here. My distillation of its contents and related information from other sources is that government-funded basic research in the United States is too heavily invested in growth. Basic research funds are primarily directed to professors for student-driven research. This investment focus is very good for rapidly expanding the labor pool of scientists, and it gives scientists a generic, useful purpose regardless of the usefulness of their research - the job of training new scientists. However, this system drives the training of new scientists faster than they can be employed within basic research, and any equilibrium requires a very high rate of attrition. While the overall US investment into research and development as a fraction of GDP has remained relatively steady within the 2%-3% range for over 50 years, there has been a steady shift of that funding from the US government, which supports a mix of basic and applied research, to private businesses, which predominantly support applied research. I suspect that this shift has been inevitable and will continue until a new equilibrium is reached - private businesses do not invest much in the training of new scientists, so the shift reduces the overall rate at which new scientists are trained, and that rate will hopefully come to match a more natural rate of career attrition (retirements of older scientists rather than younger scientists being forced into non-research jobs). Of course, this also has many obvious problems - not all fields of science are directly and immediately useful for private businesses, so career outcomes are likely to be filtered by a field’s short-term utility rather than its possible long-term impact, and it pushes more activity from basic research to applied research. Applied research is less likely to be published and less broadly useful, and a recent study has shown that applied research outcomes (e.g. patents) heavily depend on past basic research outcomes (e.g. scientific publications). Thus, in eroding our present capacity for basic research, we are eroding our future capacity for applied research. In my view, a “simple” solution to this problem is to guarantee that scientists be able to continue basic research work with some fraction of their time when employed by private businesses to conduct mostly applied research. Of course, achieving this solution will not be easy, and it will probably require some combination of legislation and collective labor actions among scientists.

In addition to these broader scientific problems, my core interest of atomistic simulation methodology has had its own unique problems over the last decade or so. Electronic structure theory has traditionally been a subdiscipline of physics, but its popularity within physics peaked in the 1990’s and early 2000’s. By the time I graduated in 2008 (into a recession, no less), electronic structure theorists were being pushed into materials science departments as the hiring in physics began to shift to more trendy topics (e.g. topological phenomena). However, the hiring in materials science was mostly dominated by applications-driven research - they were much more interested in people applying established theoretical and computational tools than in people trying to develop new tools. This reflects a broader trend: career outcomes are generally poorer for people working on method and software development than for people applying those methods, since development produces fewer papers and dwells on time-consuming activities like programming that do not directly benefit scientific careers. There has been a recent push in the US to invest more into research cyberinfrastructure, and the institute that I work for is one small piece of that investment focused on computational chemistry. However, my institute draws a sharp distinction between method and software development, and our focus is strictly on software. We serve the methodology that exists and that’s it. Methodology research itself is an unforgiving, high-risk slog, and it largely manages to survive by clustering around specific paradigms and ossifying into something more tangible and predictable. In computational chemistry, the two enduring paradigms are classical treatments of molecular dynamics using interatomic potentials and Gaussian-based calculations of electronic structure, using either density functional theory (DFT) or quantum many-body methods. As I’ve discussed multiple times in this blog, there used to be a third paradigm of semiempirical modeling between these two that was able to achieve some novel compromises between cost and accuracy, but its popularity collapsed in the early 1990’s - it no longer has an appreciable developer community, it no longer trains students, and it no longer attracts research investments. Unfortunately, my institute was envisioned to serve the paradigms that persist rather than to revive dormant paradigms or create new ones, and it is difficult to pursue methodology research without the backing of an established research community. This methodological polarization in computational chemistry can even be seen in the private research investments of billionaires - D. E. Shaw Research supports classical molecular dynamics method development and The Flatiron Institute supports electronic structure method development.

Even traditional electronic structure methodology research is now under increasing pressure because two popular research topics - machine learning and quantum computing - are applicable to electronic structure. Many people have given up on developing faster algorithms for electronic structure, and they are instead using machine learning to interpolate new results from enormous amounts of precomputed reference data. There is more support to develop quantum algorithms for electronic structure simulations on hardware that won’t exist for several decades than there is to develop genuinely new classical algorithms that work on our present classical computers. It may soon become almost impossible to get funding in electronic structure methodology research unless it is based on either machine learning or quantum computing somehow. While there is widespread optimism that both machine learning and quantum computing will revolutionize electronic structure simulations, I am extremely skeptical and pessimistic about their practical utility. Machine-learning models simply require too much reference data to attain the level of accuracy, reliability, and broad applicability that we expect from electronic structure simulations. Quantum computing enables electronic structure algorithms with better asymptotic scaling, but their cost prefactors will inevitably limit their application severely. This has happened before, and it will happen again - there was widespread optimism in the 1990’s about linear-scaling electronic structure algorithms, but their cost prefactors are still too high for them to actually be cheaper for the routine simulations that people can afford to do.

Career concerns are certainly not new for me. I started to realize towards the end of graduate school that it was difficult to plan a long-term career trajectory, but I just assumed that it would all work out somehow along the way. Unfortunately, it hasn’t worked out. I learned first-hand how narrow and limited the basic research investments are at the US National Labs, and I’ve failed enough times at applying to tenure-track faculty positions to know that it isn’t a realistic career option for me. I have not met any like-minded people along the way. I’ve had escapist fantasies about developing some of my methodology research into commercial software instead of publishing it, in order to gain financial independence and maximum freedom to pursue my research interests, but I know full well that commercial successes of this sort are even more elusive than academic successes. However, while commercial software development isn’t likely to succeed for me, it is at least a direction and an activity other than writing job and grant applications that only ever get rejected and writing papers that only ever get ignored. I do strongly believe that the best science is open and the best scientific software is open source, but someone ultimately has to pay for these things. When funding gets tighter, scientists hold back more knowledge to maintain advantages over their peers in the fierce competition for scarce resources.

Over the last several months, I’ve thought a lot about how to steer my career in a more favorable direction and regain more of a sense of control over my future. A common feature of all the plans that I considered was a consolidation of my research into a narrow, coherent path. As I noted in my last blog post, I have 4 projects/papers that I’m in the process of finishing right now, and those account for all of my research interests that aren’t already on indefinite hold. From there, my research will resume down a narrower path. My ultimate goal remains the same as when I started this blog - to develop a new generation of semiempirical models - but now I plan to develop them as a Software-as-a-Service product, because otherwise their ongoing development will not be financially sustainable (as research funding in this area is hopeless right now). As I’ve stated in past posts, there are 3 technical pillars of this effort - correlation models, basis sets, and solver algorithms. Of these pillars, correlation models generate the most academic interest and have the most technical uncertainty, and my open research efforts and open-source software development will narrowly focus on correlation models. While innovations in basis sets and solver algorithms are equally necessary to improve semiempirical modeling, there is very little academic interest in these topics, and I plan to keep my development of them proprietary in the absence of sustained research support. This plan will help me to build a better academic reputation by focusing on a popular research topic - correlation modeling - that has ties to two other popular research topics - machine learning and quantum computing. All the while, I will also be developing the other necessary technical components of new semiempirical models in silence, since there really isn’t anyone to discuss them with anyway (I’ll still summarize their progress in this blog, at least).

In the remainder of this post, I’ll discuss some of the different paths that I considered while deciding on my present course of action and how I plan to proceed when my last few ongoing papers are finished.

Possible scientific software projects to focus on

My first consideration was about the research projects that I could plausibly develop into useful software on my own. For this endeavor to be worthwhile from a career perspective, the software must either be of enough academic interest to sustain an academic career or useful enough to enough people to be a viable commercial product that I can make a living off of. While I have developed many small software projects over the years and made minor contributions to a few larger projects, I have never seriously pursued my own large project intended for widespread use for a variety of reasons. First, I’ve never had institutional support to do so at any of my jobs, and it would be hard to sustain sufficient effort as just a hobby. Second, most of my methodological pursuits still have too much technical uncertainty to be worth committing to. Third, I have strong, recurring doubts that this would actually help my research career even if everything went well technically. Only the second reason is a technical concern, and it is sufficient to narrow the list of possible projects down to three.

Dense Hermitian eigensolver

This is the project with the least technical uncertainty, but also the one that I am most certain is not worth the effort. It targets a growing problem in scientific computing, but the effort required to solve the problem substantially exceeds what anyone is willing to support. Numerical linear algebra involving dense matrices is a very mature subject with underlying theory that was mostly developed over 50 years ago, and mature implementations in open-source libraries: elementary operations in the Basic Linear Algebra Subprograms (BLAS) that are used to construct solvers in the Linear Algebra PACKage (LAPACK). BLAS routines are divided into 3 levels corresponding to vector operations, matrix-vector operations, and matrix-matrix operations. For performance reasons, it is essential for the computational bottlenecks of LAPACK solvers to be implemented entirely using level-3 BLAS operations (matrix-matrix), which are able to hide the growing imbalance between communication and computation costs on modern computers. The most extreme case right now is the cost ratio of a CPU-GPU communication operation to a GPU floating-point operation, which is around 1000-to-1. To hide this imbalance and approach peak floating-point performance, solvers need to apply level-3 BLAS operations on submatrices (“blocks”) of dimension close to 1000. LAPACK solvers can be roughly grouped into two types - linear-system-like solvers and eigenvalue-like solvers. Level-3 BLAS implementations of linear system solvers are very old; they were even available in LINPACK, one of the precursors of LAPACK that is still used to rank supercomputers in the Top500 List. Unfortunately, eigensolvers do not have fully level-3 BLAS implementations, and their performance has been steadily degrading compared to linear system solvers in both shared-memory and distributed-memory settings.
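
To see this gap for yourself, here is a minimal timing sketch (not a rigorous benchmark) that compares a dense symmetric linear solve against a full symmetric eigendecomposition. It assumes NumPy linked against an optimized BLAS/LAPACK, and the matrix size is arbitrary:

```python
# Minimal timing sketch (not a rigorous benchmark), assuming NumPy is linked
# against an optimized BLAS/LAPACK; the matrix size is arbitrary.
import time
import numpy as np

n = 2000
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
A = A + A.T + n * np.eye(n)      # symmetric and well-conditioned
b = rng.standard_normal(n)

t0 = time.perf_counter()
x = np.linalg.solve(A, b)        # linear system: blocked, level-3-BLAS-friendly
t1 = time.perf_counter()
w, V = np.linalg.eigh(A)         # symmetric eigensolver: tridiagonalization bottleneck
t2 = time.perf_counter()

print(f"solve: {t1 - t0:.2f} s    eigh: {t2 - t1:.2f} s")
```

The eigendecomposition does more arithmetic than the solve, but the time gap is usually larger than the operation counts alone would suggest, which is a symptom of the bottleneck discussed below.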

The problem with modern eigensolvers is that they require the orthogonal transformation of dense matrices to tridiagonal matrices so that tridiagonal eigensolvers can be applied. It is impossible to implement this transformation with only large-block level-3 BLAS operations. In recent years, there have been two attempts to circumvent this problem. The ELPA library splits the transformation into two steps using an intermediate banded matrix, and the dense-to-banded step can be implemented with level-3 BLAS operations. This doesn’t eliminate the memory bottleneck, but it reduces its overall impact. The EigenExa library transforms to a pentadiagonal matrix instead of a tridiagonal matrix and uses a novel pentadiagonal eigensolver. This transformation still can’t utilize large-block level-3 BLAS operations, but it also reduces the size of the memory bottleneck somewhat. Both of these approaches avoid the obvious solution of transforming to a banded matrix with a large bandwidth and using a banded eigensolver because a fast and reliable large-bandwidth banded eigensolver doesn’t exist. The SLATE library, which is refactoring dense linear algebra software for exascale supercomputers, plans to incorporate the same approach that ELPA uses and will have the same performance limitations.
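
Just to make the banded piece concrete, here is how a symmetric banded eigenproblem is posed to SciPy’s existing LAPACK-backed banded solver (a sketch with an arbitrary size and half-bandwidth). The missing ingredient is not this storage format or interface, but a solver behind it that stays fast and reliable when the bandwidth is large, plus a fully level-3 dense-to-banded reduction to feed it:

```python
# Sketch of posing a symmetric banded eigenproblem to SciPy's existing
# LAPACK-backed banded solver; n and the half-bandwidth u are arbitrary.
import numpy as np
from scipy.linalg import eig_banded

n, u = 200, 8
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                                    # symmetrize
band_mask = np.abs(np.subtract.outer(np.arange(n), np.arange(n))) <= u
A *= band_mask                                       # zero everything outside the band

# Pack the upper triangle into LAPACK band storage: a_band[u + i - j, j] = A[i, j]
a_band = np.zeros((u + 1, n))
for d in range(u + 1):                               # d = j - i, the diagonal offset
    a_band[u - d, d:] = np.diag(A, k=d)

w_band = eig_banded(a_band, lower=False, eigvals_only=True)
w_full = np.linalg.eigvalsh(A)
print(np.max(np.abs(w_band - w_full)))               # agreement to roundoff
```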

I’ve been working on the technical components of a banded eigensolver for a long time, spread out over several projects such as the Cauchy kernel project that I finally finished in late 2019. These efforts have made extremely slow progress because they have had essentially no overlap with any technical work that I have been paid to do. I need a lot of sustained focus to make progress, which is impossible when I have other work to do. Everything has come together to the point where I can see a roughly 1-year technical path to a proof-of-principle prototype algorithm, but the only way I’ll ever have a solid year to focus on this is if I quit my job. This just isn’t something that physicists or chemists care very much about, so to have any academic value I would have to promote myself among mathematicians, which I haven’t been very successful at in the past. This also isn’t something that can be realistically commercialized because people expect dense linear algebra software to be free like BLAS and LAPACK. The only example of a commercial numerical linear algebra library that I am aware of is from the Numerical Algorithms Group, which suggests a very tiny commercial market that is unable to sustain much commercial activity. I would also be putting myself on the hook to make the eigensolver highly performant in a wide variety of computing environments, which requires substantially more effort than simply implementing a working prototype.

Quantum many-body solver

This project is the further development of the new finite-temperature variational quantum Monte Carlo method that I discussed in my previous post. The basic theoretical foundation of this method has already been developed, and preliminary numerical tests have been surprisingly successful. The detail that I was worried the most about - the tightness of entropy bounds - actually caused me the least amount of trouble, although it might be because my test problem was too easy. This has the potential to be a broadly useful many-body simulation tool - capable of simulating both fermions and bosons and capable of both equilibrium sampling and non-equilibrium time evolution - but a lot more development work is still needed to fill in the details of each of these cases.

This project would have substantial academic value because quantum many-body problems are still of great interest in physics and chemistry, and its technical approach shares common ground with quantum computing and machine learning research. I can try to exploit those underlying connections to promote this work in academic circles. The closest available software right now is NetKet, which seems to be doing quite well academically, since its development is being supported by the deep pockets of the Flatiron Institute. My approach has the differentiating features of being able to simulate finite temperatures in a variational manner and exclusively utilizing direct sampling so as to avoid the inefficiencies of Markov-chain Monte Carlo sampling. I’m also going to avoid the use of established machine-learning models and focus on more physically motivated cluster expansions, which I believe will be more reliable and effective when these methods are pushed to very high accuracy.
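
As a toy illustration of that last point - one that has nothing to do with my actual method - here is a comparison of i.i.d. direct sampling against a Metropolis random walk targeting the same one-dimensional Gaussian. The step size is arbitrary, and the lag-1 autocorrelation is the crudest possible proxy for the statistical inefficiency that Markov-chain sampling introduces:

```python
# Toy illustration (not my VMC method): i.i.d. direct sampling versus a
# Metropolis random walk targeting the same 1D standard Gaussian, to show the
# autocorrelation penalty of Markov-chain sampling. The step size is arbitrary.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

direct = rng.standard_normal(n)               # independent samples

mcmc = np.empty(n)                            # Metropolis random walk
x = 0.0
for i in range(n):
    prop = x + 0.5 * rng.standard_normal()    # small step => strongly correlated chain
    if rng.random() < np.exp(0.5 * (x * x - prop * prop)):   # accept with min(1, p(prop)/p(x))
        x = prop
    mcmc[i] = x

def lag1_autocorr(s):
    s = s - s.mean()
    return np.dot(s[:-1], s[1:]) / np.dot(s, s)

print("lag-1 autocorrelation, direct:", round(lag1_autocorr(direct), 3))
print("lag-1 autocorrelation, MCMC:  ", round(lag1_autocorr(mcmc), 3))
```

Correlated samples carry less statistical information per sample, so MCMC pays for its autocorrelation with extra samples at a fixed error target; direct sampling avoids that tax entirely.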

Quantum many-body solvers are not really used outside of academic circles, partly because none of them are very reliable yet and partly because the problems that they are needed to solve have limited practical relevance. Perhaps the most promising commercial use would be to mimic noisy quantum computing devices, which are themselves starting to be packaged as on-demand cloud computing resources. If they start to command a high price and my solver can mimic their outputs reliably, then I could claim to offer an equivalent computational capability and likely be able to offer it at a lower price. However, this scenario is largely self-defeating because the broader interest in these noisy quantum computers is contingent on them having a computational advantage over classical computers for some tasks. If it becomes clear that their noise makes them efficient to simulate on classical computers, which there is growing evidence for, then such a market would quickly dry up as the hype died down.

Semiempirical electronic structure

I went through a lot of brainstorming about semiempirical electronic structure in 2019, some of which was mentioned on this blog. While there remains a lot of technical uncertainty overall, I was able to bifurcate my plans into a low-uncertainty short-term plan and a high-uncertainty long-term plan. The success of the short-term plan might inform the development of the long-term plan, but they otherwise do not have much direct technical overlap. The essence of the short-term plan would be to fix the largest technical problems in existing semiempirical models and rigidly couple them to low-cost, low-accuracy linear-scaling solver algorithms. These technical problems are the severe approximations of 4-center Coulomb integrals, the restriction to a minimal basis, and the independence of dispersion parameters. Much less severe approximations could be used to maintain a low cost of 4-center Coulomb integrals while preserving a lot more accuracy, an effective double-zeta-polarized basis set could be introduced with almost no added cost by carefully controlling the pattern of non-zero Hamiltonian matrix elements, and the low-energy-scale dispersion parameters could be effectively derived from other easier-to-fit high-energy-scale parameters by using more physical models for dispersion interactions. While these innovations could substantially improve on existing semiempirical models, they will quickly hit a technical wall in how much they can improve accuracy.
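
For readers who haven’t seen one, a pairwise dispersion correction has a very simple structure. The sketch below is the familiar damped DFT-D2-style form, written out only to fix ideas; the C6 coefficients, van der Waals radii, and damping constants in it are placeholders, not a proposed parameterization:

```python
# Familiar damped pairwise dispersion form (DFT-D2-style), shown only to fix
# ideas; the C6 coefficients, vdW radii, and damping parameters here are
# placeholders, not a proposed parameterization.
import numpy as np

def dispersion_energy(coords, c6, r_vdw, s6=1.0, d=20.0):
    """E_disp = -s6 * sum_{i<j} sqrt(C6_i*C6_j)/r^6 * f_damp(r)."""
    e = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(coords[i] - coords[j])
            c6_ij = np.sqrt(c6[i] * c6[j])            # simple combination rule
            r0 = r_vdw[i] + r_vdw[j]
            f_damp = 1.0 / (1.0 + np.exp(-d * (r / r0 - 1.0)))
            e -= s6 * c6_ij / r**6 * f_damp
    return e

# Hypothetical two-atom example with made-up parameters (atomic units):
coords = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 7.0]])
print(dispersion_energy(coords, c6=[15.0, 15.0], r_vdw=[3.0, 3.0]))
```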

The academic value of semiempirical modeling is very low. Semiempirical approaches to DFT out-competed the more traditional approaches to semiempirical modeling in the early 1990’s, just as the main developer of semiempirical models throughout the 1970’s and 1980’s - Michael Dewar - was retiring and attempting (largely unsuccessfully) to transfer and continue his semiempirical research program through younger collaborators. One of Dewar’s former collaborators - James Stewart - had forked this research program and software development in the early 1980’s into what would become MOPAC, which is now the last bastion of traditional semiempirical modeling. However, the development of MOPAC was never academically successful. Instead, it has persisted for over 3 decades as a mostly commercial effort, although versions of it have always been freely available for academic use. DFT was itself the inspiration for a new approach to semiempirical modeling in the 1990’s called Density Functional Tight Binding (DFTB). While it has never been too popular, it has maintained a steady albeit modest academic existence and a well-maintained open-source implementation called DFTB+. The most recent semiempirical models and their open-source implementation in XTB have come from the Grimme group, which has cross-pollinated some of the technical details of older semiempirical models with DFTB models (mostly a higher-order inter-atomic multipole expansion for electrostatics) and provided some new parameter sets with wide coverage of the periodic table and compatibility with the most recent dispersion models developed by the Grimme group (the group is best known for its DFT-compatible dispersion models). Realistically, this means that academic activity in semiempirical modeling is at subsistence levels - there is just enough activity to propagate old ideas into the development of new models but little room for more exploratory innovation.

The overall market for commercial atomistic simulation software is not enormous, but it is large enough to warrant market analysis and continues to grow as atomistic modeling finds more industrial applications. While semiempirical models are a small segment of this overall market, there are obvious growth opportunities if either (1) linear-scaling implementations can be reduced in cost enough to compete with high-end interatomic potentials for classical molecular dynamics applications or (2) new semiempirical models can be improved enough in accuracy to be competitive with low-end DFT calculations. What exists right now isn’t quite up to either of these challenges, but they are both realistic targets. Given the low academic value of semiempirical modeling, it is actually counterproductive for new models to be specified openly in the scientific literature, and they are likely to be more successful if they remain proprietary. For example, in my very first blog post I presented evidence from publication trends that many more people used the very popular, multi-purpose GAUSSIAN software to perform semiempirical calculations than the special-purpose MOPAC software using the same semiempirical models that were initially developed for MOPAC and eventually replicated in GAUSSIAN from their open specifications in the scientific literature. This is a market dominated by a small number of very popular commercial and quasi-commercial efforts (e.g. VASP, GAUSSIAN, AMBER, CHARMM) that are active enough to implement new features from their literature specifications if there is sufficient user demand. Thus, having a chance at penetrating such a market requires a competitive advantage such as proprietary features that have no open specification and would require substantial reverse-engineering to implement independently. There is already some precedent for this - the QUASINANO2013 parameter set for DFTB that has large coverage of the periodic table is proprietary and only available in SCM’s commercial implementation of DFTB.

Possible business models for scientific software

The other axis of consideration was possible business models for software, if I get to the point where I have something that is worth selling and there aren’t good alternative research career options. In looking at past commercial successes of atomistic simulation software and other scientific computing software, it is clear that academic success has always preceded commercial success. Basic research in an academic setting is necessary to buffer the uncertainty in developing this kind of software, so that it can develop into a more stable, well-defined entity without onerous amounts of time and budgetary pressure. Often, pursuing commercial development opens up more opportunities for research funding to bridge the gaps between research software and commercial software, such as through the US Small Business Innovation Research (SBIR) program. Adjusted for inflation to 2020 US dollars, the SBIR program has awarded $7 million to Schrodinger, Inc., $19 million to Q-Chem, Inc., and $3 million to MOPAC as several notable examples. Thus, research software that was initially developed using support from basic research grants has been able to smoothly transition to commercial software with further US government support. Research projects that fail to attract funding within the arena of basic research are also likely to be on their own in any attempts to develop into commercial software products. To be truly successful as a commercial product, commercialized software must also be able to generate a profit; otherwise it is just gaming the system to attract more development funds. I’ve considered 3 different business models in imagining a future business.

Traditional software licenses

Historically, commercial atomistic simulation software has been monetized through the sale of software licenses, either for a one-time cost or with recurring maintenance/support fees. Because the market for this software is small, estimated to be a few tens of thousands of scientists, the price of these licenses usually ranges from hundreds of dollars to tens of thousands of dollars, typically with lower prices offered to academics and higher prices offered to industrial users. Lower prices are not commercially viable without a larger market. These high prices are yet another reason why the market is dominated by software built on strong academic reputations - people need a lot of trust in the science underpinning the software to be willing to pay such a high price for it. Younger scientists and newer software trying to enter into this commercial market are at a severe disadvantage, and have to work very hard on the academic side to build up a competitive reputation to have a chance at selling software for sustainable prices.

Indebted open-source licenses

There has been a big push in recent years for science to be more open and reproducible, and a corresponding push for scientific software to be open source. In my own research, I have been following these standards for about 8 years now, and all the papers I have written in that time have been accompanied by whatever custom software was used to generate their results. Ideally, I would like all of my software development to be open source, but the projects that I mentioned above are not supported by my current job and I have yet to find a job that will support any of them. How to pay for open-source software development is an existential question that frequently comes up in the open-source software community, and it doesn’t have a generic, universal answer. Financially self-sufficient open-source projects either serve a very large user base (e.g. internet infrastructure software) or corporate customers that are willing to pay for ongoing support and/or customization. Neither of these scenarios is applicable to the scientific software that results from basic research. Academic open-source software is funded by temporary research grants and written by temporary labor (grad students and postdocs), and eventually the money and labor disappear. More successful projects are able to maintain a series of multiple grants to continue their development and transfer development between cohorts of grad students and postdocs. Most projects simply become permanently inactive - their openness allows them to remain available for reproducibility and posterity, but they cease to have a future beyond that.

I spent some time imagining how open-source software could become more financially sustainable, and I came up with an interesting concept that seems reasonable on paper but probably won’t work in practice for scientific software. The basic premise would be to release software under an “indebted” license that reverts to a more permissive open-source license once a stated overall debt had been paid or enough time had elapsed for the debt to “default”. While the license is indebted, users would be obligated to state how much they are willing to pay off the debt (zero being an acceptable amount), and developers would be obligated to report on how much the debt has been paid off. The indebted software could be used in downstream open-source projects, which would then inherit their indebted license provisions. This could lead to some interesting effects such as an obscure piece of indebted software being used by a popular open-source project, which then quickly pays off its debt to open the license back up again. If the debt is stated as a single large sum, then that will probably be quite intimidating to potential donors. This model would probably work best if the debt is slowly incremented as features are added to software. Unfortunately, I just don’t think people are generous enough for this idea to work. There would still be plenty of software with very few users and unpaid debts in which the developers invested a lot of time for no compensation. It is just difficult to match support with need, and promoting an open-source software development culture where needs are explicitly articulated through indebted licenses might help. There would be a lot more open-source software development if it were easier to turn into a reliable job rather than just a hobby.

Software as a Service

The final business model that I considered was the Software-as-a-Service (SaaS) model. This is a very new business model, facilitated mainly by the recent commoditization of computing through cloud computing services. For compute-intensive tasks like atomistic simulations, a SaaS product wouldn’t be realistic if cloud computing costs substantially more than maintaining your own hardware. That was the case for many years while AWS held a near-monopoly on cloud computing, but recent competition from Microsoft and Google has driven cloud computing prices down to more reasonable levels in the last few years. There are a lot of remarkable SaaS success stories, but a common feature of these successes is relatively simple software with short development times targeting a well-chosen niche. An important feature of this model is the incrementalization of costs, whereby a relatively modest amount of money can be charged to use a program just a few times rather than requiring a user to buy an expensive license. There is a tangible ongoing benefit to user retention (it directly impacts the continuous revenue stream), and revenue is more uniform rather than spiking when new versions of software commanding new licensing fees are released. By removing the large upfront cost, a SaaS-based product can more easily attract curious new users who aren’t yet ready to make a large financial commitment. This can possibly change the dynamic of first having to build up an academic reputation before being able to market a high-cost software product into one where the academic reputation is built while developing and expanding the capabilities of a SaaS product.

The Software-as-a-Service business model is not incompatible with having open-source software, but the choice of what is open and what remains closed will strongly impact commercial viability. If you just create some fully open-source software and then package it up as a SaaS product, then there is nothing stopping someone else from also packaging it up as a SaaS product and charging less money for it. Something like this hasn’t happened yet with scientific software, but it has been happening with open-source database software over the last few years. With scientific software, I believe that the trick is to hold back some “secret sauce” in the SaaS product that differentiates it from the purely open-source software component while making the more academically interesting parts of the code as open as possible. Such proprietary components might be ease-of-use features such as automatic fine-tuning of parameters for non-expert users, or optimized implementations that offer higher performance or support for specialized computing hardware (e.g. GPUs).
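
As a cartoon of that split (Flask is just a stand-in web framework here, and every function name below is hypothetical), the openly published method lives in one layer while the proprietary conveniences live in another, behind a metered endpoint:

```python
# Cartoon of an open core wrapped by a proprietary SaaS layer; Flask is a
# stand-in framework and all function names here are hypothetical.
from flask import Flask, jsonify, request

app = Flask(__name__)

def tune_parameters(job):
    # Proprietary "secret sauce": automatic parameter fine-tuning for
    # non-expert users (closed source, lives only on the server).
    return {**job, "tuned": True}

def run_open_core(job):
    # Thin wrapper around the openly published, open-source method.
    return {"energy": -1.0, "inputs": job}      # placeholder result

@app.route("/v1/calculate", methods=["POST"])
def calculate():
    job = request.get_json()
    # Authentication and metered billing checks would go here.
    return jsonify(run_open_core(tune_parameters(job)))
```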

The path from a short-term plan to a long-term plan

I’m a big fan of planning, and I frequently make research plans that just don’t survive for very long under the uncertainty of research activity. I believe that the trick to improving the survivability of research plans is to have more definite short-term plans and more ambiguous long-term plans that are only clarified as more information is gathered by achieving (or failing to achieve) the short-term goals. With my newfound open research focus on correlation modeling, the short-term goal is to develop my variational Monte Carlo (VMC) method into the best possible high-cost, high-accuracy quantum many-body solver that it can be, including an open-source software implementation. While the present version of my VMC paper is just the basic theory and preliminary numerical results, the full version will contain a broader vision of how I plan to develop the capability. Once the software has reached a minimum viable stage of development, I will aggressively promote this simulation capability in papers, at conferences, and on social media.

On the software/business side, I will probably try to learn how to build a SaaS product by implementing cloud-based calculations as an alternative to local calculations and setting up all the relevant payment processing infrastructure. Cloud-based Monte Carlo simulations are enticing in that independent samples can all be computed simultaneously without affecting the overall monetary cost of the simulation, or at least they can be parallelized until the overhead of initializing a container or the granularity of cloud computing costs becomes a noticeable contribution to the overall cost. While I don’t expect this to be even remotely profitable, it will be an excellent excuse to learn the skills that I will eventually need to develop SaaS-based semiempirical electronic structure methods.
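
The cost argument is easy enough to write down. This back-of-envelope model ignores the container-initialization overhead and billing granularity mentioned above, and the per-sample time and per-core-hour price are made-up numbers:

```python
# Back-of-envelope cloud cost model for embarrassingly parallel Monte Carlo;
# ignores container startup overhead and billing granularity, and the
# per-sample time and per-core-hour price are made-up placeholders.
def cloud_cost(n_samples, sec_per_sample, price_per_core_hour, n_workers):
    core_seconds = n_samples * sec_per_sample        # independent of worker count
    wall_seconds = core_seconds / n_workers          # what parallelism actually buys
    dollars = core_seconds / 3600.0 * price_per_core_hour
    return dollars, wall_seconds

for workers in (1, 100, 10_000):
    dollars, wall = cloud_cost(100_000, 10.0, 0.05, workers)
    print(f"{workers:>6} workers: ${dollars:.2f}, {wall / 3600:.2f} h wall time")
```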

While high-cost, high-accuracy quantum many-body solvers are not directly relevant to the development of semiempirical electronic structure methods, they will be essential for generating reliable reference data with which to test new low-cost correlation models. Limitations in our ability to generate or gather reliable reference data are a significant problem when developing correlation models - we often end up testing them against the limited, skewed data that is available rather than data that more accurately reflects their intended function. For example, exact many-body calculations are only viable for very small system sizes or systems with very special structure. Certain types of very desirable reference data such as potential energy surfaces for bond breaking within a moderately-sized molecule are just extremely difficult to generate with the simulation tools that are widely available right now. Also, I plan to develop some variant of the random-phase approximation (RPA) as my eventual low-cost correlation model, and I’d like to be able to compare it to more conventional coupled-cluster calculations in cases where they are both likely to do poorly (e.g. in the presence of strong static correlation), which again requires a reliable reference method.

The transition from short-term goals to long-term goals will occur as the VMC method and software matures and I begin to test RPA-based correlation models against it. These low-cost correlation models will also be developed openly and with an open-source software implementation. However, at this time I will also initiate the development of new linear-scaling solver algorithms and new semiempirical Hamiltonian models, the details of which will remain proprietary unless I can find a reliable patron for semiempirical electronic structure research. I will continue to discuss high-level information about this proprietary development - enough to articulate its progress and capabilities, but not enough to replicate it precisely.

Just as appreciating the value of maximal openness in basic research and open-source software development helped me to become a better scientist almost a decade ago, I now appreciate the necessity of carefully restricting access to some information to survive when support for basic research is far from sufficient to fund the activities of all capable scientists and competition for scarce resources becomes fierce. I don’t think it is the lack of openness itself that is necessarily bad for science, but rather a lack of clarity and honesty about what information is open and what is closed. I hope that this attitude adjustment helps me to become a more successful scientist in the future, and I will continue to document that journey here in my blog.