Should Open-Source Generative AI Concern Early Closed-Source “Leaders” Like OpenAI and Google?

By Reece Hayden | 2Q 2023 | IN-6944

The first stage of Generative Artificial Intelligence (GAI) deployment was founded on closed-source models with some Application Programming Interface (API) accessibility; however, a new wave of announcements has embraced open sourcing. Hugging Face’s HuggingChat, an open-source chatbot based on Open Assistant, is just one such announcement. With more open-source announcements likely to emerge, enterprises and stakeholders must decide which approach will be most commercially and operationally effective before defining their strategy.

Open-Source Initiatives Like HuggingChat Will Create Significant Competition for Closed-Source Alternatives

NEWS


OpenAI’s ChatGPT is based on a closed Large Language Model (LLM) (GPT-3, 3.5, or 4), meaning that consumers cannot see or change the underlying model and can only access it through an Application Programming Interface (API). By contrast, much of the recent activity in the Generative Artificial Intelligence (GAI) market has focused on open-source technology (any program whose source code is made available for use or modification by users or other developers):

  • Hugging Face announced HuggingChat, an open-source chatbot based on Open Assistant (a customization of Meta’s foundation LLM, LLaMA; this is just one example of increased developer engagement with Meta AI’s leaked LLM). This application aims to further democratize chatbots, in direct competition with OpenAI’s ChatGPT. HuggingChat is built on a “Space” that enables the underlying LLM to be adapted and replaced with other open-source LLMs, meaning that customers can customize and tailor the solution to meet their requirements.
  • Databricks released Dolly 2.0, an open instruction-tuned LLM based on an EleutherAI model. It has been fully open sourced for both research and commercial usage.
  • Stability AI announced StableLM, a suite of open-source LLMs available for developers to use and adapt on GitHub.

With open-source GAI LLMs and chatbots being deployed, we can expect significant pressure on market incumbents. Enterprises and stakeholders must now evaluate whether open- or closed-source solutions will more effectively mesh with their strategy.

What Is the Value of Open-Source GAI for Enterprises and Market Stakeholders?

IMPACT


Open-sourcing GAI unlocks the underlying model to the public, offering greater data transparency and customizability. But how does this approach create opportunities and challenges for enterprises and market stakeholders?

  • Opportunities:
    • Reduce Vendor Reliance: Closed-source models will quickly lock stakeholders into an ecosystem, much as the public cloud model does, while open-source solutions enable greater model portability. In addition, because the underlying model is open, customers are no longer reliant on central updates/APIs.
    • Lower Barriers to Customization: Closed-source models are inflexible, offering access only through specific applications like ChatGPT or through APIs. Open-source models can be freely customized and used to build custom applications, providing a foundation on which stakeholders can innovate without paying for API access.
    • Higher Speed of Innovation on the Bleeding Edge of Technology: By following a community approach, open-source projects have a much higher speed of innovation and development. Any member of the public can alter/tweak the underlying model to improve performance/debug.
    • Greater Transparency and Security: Open-source solutions enable customers to understand the underlying model and the degree to which sensitive information is shared and used for training purposes. This is especially important for financial or intellectual property-heavy verticals concerned that information used to prompt chatbots will be leveraged for model development. Put simply, ChatGPT collects user data, while HuggingChat does not, making HuggingChat the obvious choice for enterprises placing a high value on data privacy.
    • Lower Hardware Resource Requirements: Open sourcing allows end users to contextualize the foundational model (LLM) to their use case by, for example, reducing the number of parameters or providing more effective training data, helping lower the workload burden on hardware resources.
  • Challenges:
    • In-House Developmental Expertise: Open-source projects rely on enterprise or stakeholder DevOps expertise to package GAI solutions into usable applications/services. This is time/resource-intensive and relies on a skill set to which not all stakeholders will have access.
    • Open-Source Projects Lack Resources to Optimize Performance: As LLMs move away from billions of parameters, optimizing the models to maximize the value from data is essential. Open-source models may lack the resources to sufficiently optimize performance.
    • Significant Vulnerability, Given Public Access to Underlying Models: Stakeholders will only become more wary about the risk of enterprise GAI. Open-source models enable complete transparency, meaning that anyone can see vulnerabilities, creating greater risk for end-user data/processes.
    • Fragmentation: At the project level, open-source solutions can quickly fragment, given the reliance on a community-centric software development process founded on freedom. This can increase security risks, costs, and complexity, while reducing the pace of innovation.
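The hardware point under Opportunities can be made concrete with some simple arithmetic. The sketch below is only a rough illustration: it estimates the memory needed to hold a model's weights as parameter count multiplied by bytes per parameter, and it ignores activations, caches, and any serving overhead. The function name and the model sizes are hypothetical and are not taken from any of the projects mentioned above.

```python
def weight_memory_gib(num_parameters: int, bytes_per_parameter: float) -> float:
    """Approximate memory needed just to hold the model weights, in GiB."""
    return num_parameters * bytes_per_parameter / (1024 ** 3)


# Hypothetical model sizes and numeric precisions, for illustration only.
MODEL_SIZES = {"7B": 7_000_000_000, "30B": 30_000_000_000}
PRECISIONS = {"16-bit": 2, "8-bit": 1}

for label, params in MODEL_SIZES.items():
    for precision, nbytes in PRECISIONS.items():
        gib = weight_memory_gib(params, nbytes)
        print(f"{label} parameters at {precision}: ~{gib:.1f} GiB of weights")
```

Under these assumptions, both a smaller parameter count and a lower numeric precision shrink the weight footprint proportionally, which is the kind of adaptation an open model permits and a closed API does not.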

What Should Stakeholders Consider When Defining Their GAI Strategy?

RECOMMENDATIONS


System Integrators (SIs), Managed Services Providers (MSPs), Independent Software Vendors (ISVs), and Value-Added Resellers (VARs) will want to play a commercial role in GAI by bridging the gap between providers (across modalities) and enterprises. We have already seen activity with the partnership between OpenAI, Siemens, and Bain, but more will likely emerge. Before they implement their commercial strategy, they must choose whether they wish to embrace an open- or closed-source approach. ABI Research recommends that stakeholders answer the following questions before they make this decision:

  • What capabilities do we have in-house? Leveraging open-source software requires deep domain expertise, which may not be available given the early stage of the GAI commercial market, as well as enough skilled staff to customize and develop on top of the open-source foundation.
  • Who are our current customers and where do we see key use cases? Performance, data privacy, and other factors will all be important, but customers put emphasis on different factors. For example, financial and healthcare enterprises will put emphasis on data privacy, while manufacturing players may care more about performance and productivity. Understanding who you are selling to will help define your strategy.
  • Which strategy will create the highest Return on Investment (ROI)? Leveraging closed solutions requires stakeholders to pay third parties for API access, but minimizes developer costs. Open-source solutions are free at the point of consumption, but they require significant developer costs to productize, customize, optimize, integrate, maintain, and upgrade.
  • What type and level of risk are we willing to take on? Commercializing a GAI solution will come with risk no matter which strategy you choose; however, the level and type of risk varies. Open source comes with a high level of risk associated with in-house development/maintenance, while closed source means that stakeholders are reliant on a hyperscaler’s walled-off development and strategy. This is especially important given that stakeholders are then reliant on the hyperscaler’s data privacy policies, which will likely be central to enterprise decision-making.
  • What deployment strategy do we want to pursue at the corporate level? It is important for stakeholders to assess how they want to productize GAI. A domain-specific approach requires open-source models to create highly customized solutions for customers. However, this approach comes with higher price tags and longer time frames, which may not resonate with customers. A general approach that leverages APIs to access closed-source models will be quicker to deploy and come with relatively low price tags, but will not be optimized for customer use cases.
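The ROI question above is essentially a break-even calculation, and a minimal sketch may help frame it. Every figure below is a hypothetical placeholder rather than market data, and the `CostModel` class and `break_even_month` function are illustrative names. The closed-source option is modeled as low upfront cost with high per-request API fees, and the open-source option as high upfront and ongoing developer cost with low per-request cost.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CostModel:
    upfront_cost: float          # one-off integration or development spend
    monthly_fixed_cost: float    # staff, maintenance, hosting
    cost_per_1k_requests: float  # API fees or inference cost

    def total_cost(self, months: int, requests_per_month: int) -> float:
        variable = self.cost_per_1k_requests * requests_per_month / 1000
        return self.upfront_cost + months * (self.monthly_fixed_cost + variable)


def break_even_month(closed: CostModel, open_source: CostModel,
                     requests_per_month: int,
                     horizon_months: int = 60) -> Optional[int]:
    """First month in which cumulative open-source cost is no higher than
    cumulative closed-source cost, or None if that never happens in the horizon."""
    for month in range(1, horizon_months + 1):
        if (open_source.total_cost(month, requests_per_month)
                <= closed.total_cost(month, requests_per_month)):
            return month
    return None


# Hypothetical figures, chosen only to illustrate the trade-off.
closed_api = CostModel(upfront_cost=20_000, monthly_fixed_cost=2_000,
                       cost_per_1k_requests=20.0)
open_model = CostModel(upfront_cost=300_000, monthly_fixed_cost=15_000,
                       cost_per_1k_requests=2.0)

for volume in (100_000, 1_000_000, 5_000_000):
    print(volume, break_even_month(closed_api, open_model, volume))
```

With these placeholder inputs, the open-source option pays back only at high request volumes, which is consistent with the argument that API access is often the cheaper route for stakeholders without large-scale demand or in-house skills. Real inputs would need to come from each stakeholder's own cost base.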

Although open-source solutions have a strong value proposition for enterprises, especially given that HuggingChat has placed data privacy front and center, it is likely that ChatGPT and other closed solutions will be favored. The reason for this is the lack of relevant skills, time, and money necessary to maximize the opportunity presented by open-source models. Hyperscalers/hardware/software vendors have, so far, cornered much of the GAI skills market with little left for market stakeholder partners. In addition, the cost of filling this skills gap would be prohibitively high, meaning that it is more cost efficient to pay for third-party APIs in closed-source models.

 
