Understanding Open Source AI: What Actually Counts as Open?

Here's something that might surprise you: OpenAI doesn't build open AI models. Their GPT and DALL·E offerings? Completely proprietary and closed-source. So what about Meta's Llama? Despite what Mark Zuckerberg says, it's not truly open source either—though it's more open than OpenAI's models, which is saying something.
When we talk about AI models, there are really three main categories:
- Proprietary
- Open source
- Open
These distinctions apply to both large language models (LLMs) and text-to-image generators. The landscape is still forming, and the Open Source Initiative is currently hammering out a strict definition of what qualifies as truly open source AI. Let's break down where things stand right now.
What Does Open Source Actually Mean?
Before diving into open source AI, let's step back and define open source itself. It's not just trendy jargon—the Open Source Initiative maintains a formal definition that outlines the philosophy and core requirements. It's published under the Creative Commons Attribution 4.0 International License, but here's the essence.
Open source doesn't simply mean you can download or access code freely. True open source must be available to anyone who wants to use it, modify it however they see fit, and use it for any purpose. Open source licenses cannot restrict any "field of use"—and this is exactly where many supposedly open source AI models fall short.
The OSI maintains a list of approved licenses. Major ones include Apache 2.0, the MIT License, and the GNU General Public License (GPL).
What Are Proprietary AI Models?
Proprietary AI models represent some of the most powerful and widely used systems available today. Private companies develop and operate them, keeping the source code, training methods, model weights, and even parameter counts under lock and key. You can only access proprietary models through official channels: chatbots, APIs, or applications built on top of those APIs.
Take OpenAI's GPT-4o. We don't know what training data went into it or how many parameters it has. The only way to use it is through ChatGPT, OpenAI's API, or third-party apps like Perplexity or Zapier Chatbots.
And yes, OpenAI charges for access. Want to use GPT-4o—arguably one of the best AI models out there? You're looking at $20 monthly for ChatGPT Plus, API fees, or integrating a paid service. You can't just download GPT-4o and run it on your own servers.
The same applies to virtually every other proprietary AI model:
- GPT-4o mini and DALL·E 3 from OpenAI
- Claude 3 and Claude 3.5 from Anthropic
- Gemini and Imagen 3 from Google
- Command R and R+ from Cohere
- Midjourney
What Is Open Source AI?
Open source AI consists of models released under open source licenses. But here's where it gets complicated: researchers have discovered that many models claiming to be open source actually aren't. This deceptive practice is called "open-washing," and it's created serious confusion—even among AI writers.
Chart showing the "openness" spectrum of various AI models
Currently, the Open Source Initiative is working to define truly open source AI because existing licenses don't fully address the technical realities of modern AI models. Meeting genuine open source standards requires more than just sharing code—you need to provide training data, training code, model parameters, and much more. Code should be shared under open source licenses, while training data and documentation should use Creative Commons or similar open licenses.
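
One way to make that checklist concrete is to record which artifacts a release actually publishes. The sketch below is only illustrative: the `ModelRelease` class, its field names, and the example release are hypothetical, and the four components are the ones named above, not the OSI's final criteria.

```python
from dataclasses import dataclass, fields


@dataclass
class ModelRelease:
    """Which artifacts a (hypothetical) model release makes public."""
    name: str
    training_data: bool
    training_code: bool
    model_parameters: bool
    open_source_code_license: bool


def missing_components(release: ModelRelease) -> list[str]:
    """Return the checklist items (boolean fields) the release does not provide."""
    return [
        f.name
        for f in fields(release)
        if isinstance(getattr(release, f.name), bool)
        and not getattr(release, f.name)
    ]


# Hypothetical release that shares weights and code but not its training pipeline.
example = ModelRelease(
    name="example-weights-only",
    training_data=False,
    training_code=False,
    model_parameters=True,
    open_source_code_license=True,
)
print(missing_components(example))
```

A release that leaves any of these out would sit somewhere short of the "truly open source" end of the spectrum, however its license is labeled.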
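Here's the thing about open source licenses: the strictest ones (copyleft licenses such as the GPL) require that anything you distribute that builds on the licensed work be released under the same terms. More permissive licenses, such as MIT and Apache 2.0, mainly ask that you keep the license notice and credit the original developers.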
What Are Open AI Models?
Open AI models occupy the middle ground between locked-down proprietary systems and the idealized vision of truly open source AI. (Until OSI releases their formal definition, OLMo 7B comes closest to that ideal.)
Simply put, open AI models are made available for free under certain conditions. Usually you can download them from Hugging Face and similar platforms, then run them locally after accepting whatever license terms come with them. You typically can fine-tune them on your own data to build custom versions, create chatbots, and develop applications. In most cases, you can examine model weights and system architecture to understand how they work (to a reasonable degree).
Open licenses still permit broad usage, but they include restrictions you won't find in true open source licenses. For example, Llama 3's license allows commercial use up to 700 million monthly active users and blocks certain use cases. You or I could build something with it, but Apple and Google couldn't. Similarly, Gemma 2's acceptable use policy prohibits "facilitating or encouraging users to engage in any form of criminal activity." That makes sense—Google doesn't want unethical Gemma-powered chatbots flooding social media.
These restrictions, while understandable, contradict open source philosophy. That's why this whole space has become contentious. Many researchers are developing classification systems to clarify exactly how open different models actually are. If any of these gain traction, we'll definitely keep you posted.
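
To see why these terms clash with the "no field of use restrictions" rule, it can help to write them down as data. This is a minimal sketch under loud assumptions: `LicenseTerms` and `open_source_conflicts` are hypothetical names, and the two example entries are drastic simplifications of the real license texts, using only the figures mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LicenseTerms:
    """Simplified, hypothetical summary of a model license."""
    name: str
    user_cap: Optional[int] = None         # max monthly active users, if any
    restricted_uses: tuple[str, ...] = ()  # use cases the license blocks


def open_source_conflicts(terms: LicenseTerms) -> list[str]:
    """List the ways the terms limit who may use the model, or for what."""
    conflicts = []
    if terms.user_cap is not None:
        conflicts.append(
            f"caps commercial use at {terms.user_cap:,} monthly active users"
        )
    if terms.restricted_uses:
        conflicts.append(
            "restricts fields of use: " + ", ".join(terms.restricted_uses)
        )
    return conflicts


# Simplified stand-ins, not the actual license texts.
llama3_like = LicenseTerms(
    name="Llama 3 (simplified)",
    user_cap=700_000_000,
    restricted_uses=("certain use cases",),
)
permissive = LicenseTerms(name="Apache 2.0 (simplified)")

for terms in (llama3_like, permissive):
    print(terms.name, "->", open_source_conflicts(terms) or "no conflicts found")
```

An empty list here only means this toy check found nothing; it is not a legal reading of any license.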
The Best Open and Open Source AI Models Today
Here's a rundown of the most noteworthy open and open source models available now. Where each one falls on the spectrum from open to open source will stay up for debate until there's a solid definition.
| AI Model | Developer | Model Type | License | Parameters | Notes |
|---|---|---|---|---|---|
| Llama 3.1 | Meta | LLM | Custom | 8B, 70B, 405B | Usage restrictions and user cap limits |
| Gemma 2 | Google | LLM | Custom | 2B, 9B, 27B | Restricted user categories |
| Phi-3 | Microsoft | LLM | MIT | 3.8B, 7B, 14B | |
| Mixtral 8x7B | Mistral | LLM | Apache 2.0 | 8x7B | |
| Mistral 7B | Mistral | LLM | Apache 2.0 | 7B | |
| DBRX | Databricks/Mosaic | LLM | Custom | Equivalent to 36B | Mixture of Experts—parameter counting is complex |
| OLMo 7B | Allen Institute for AI | LLM | Apache 2.0 | 7B | Closest you'll get to truly open source AI right now |
| FLUX.1 [schnell] | Black Forest Labs | Image Generator | Apache 2.0 | N/A | |
| FLUX.1 [dev] | Black Forest Labs | Image Generator | Custom | N/A | Non-commercial use only |
| Stable Diffusion | Stability AI | Image Generator | Custom | N/A | Earlier versions including 1.5, 2.1, and SDXL available under open licenses |
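
If you want a quick first pass over a table like this, you can split the models by whether their license is on the OSI-approved list. The snippet below is a rough sketch: it copies only a few LLM rows from the table, `OSI_APPROVED` holds just the two approved licenses that appear in those rows, and `split_by_license` is a hypothetical helper name.

```python
# Subset of OSI-approved licenses that appear in the rows below (not the full list).
OSI_APPROVED = {"Apache 2.0", "MIT"}

# A few (model, license) pairs copied from the table above.
MODELS = [
    ("Llama 3.1", "Custom"),
    ("Gemma 2", "Custom"),
    ("Phi-3", "MIT"),
    ("Mistral 7B", "Apache 2.0"),
    ("OLMo 7B", "Apache 2.0"),
]


def split_by_license(models):
    """Split (name, license) pairs into OSI-licensed and custom-licensed names."""
    osi, custom = [], []
    for name, license_name in models:
        (osi if license_name in OSI_APPROVED else custom).append(name)
    return osi, custom


osi_licensed, custom_licensed = split_by_license(MODELS)
print("OSI-approved license:", osi_licensed)
print("Custom license:", custom_licensed)
```

Keep in mind that an OSI-approved license on the weights is only one ingredient. A model can pass this filter and still fall short of truly open source AI if its training data or training code stays private.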
Should You Use Open or Open Source AI?
While truly open source AI models aren't as plentiful as we'd like, the best open models are surprisingly competitive with proprietary alternatives. Llama 3.1 405B and FLUX.1 can go toe-to-toe with GPT-4o and DALL·E 3. The upshot: if you have the technical chops to work with open source models, you can achieve similar results for significantly less money and with substantially more freedom.