
🧠Creating My Own Uncensored AI to Help Me Hack (Legally, Calm Down)

  • aldern00b
  • May 12
  • 4 min read

Updated: May 13

Let's be honest - most public AI models are basically neutered. You ask them something spicy like, "What ports does SMB use?" and suddenly you're being lectured on ethics. That's great for grandma, but for someone deep in cyber analytics, curious about Red Team tactics and needing to test scenarios in a safe, private lab - this is a problem.


So, I built my own AI. Fully offline. Uncensored. Running in my home lab on nothing but stubbornness, spare CPU cores and caffeine. This was part learning exercise, part rebellion against corporate AI censorship and part "can I actually do this without bricking my Proxmox server?"


Spoiler: I can. And I did.


🛠️The Setup: Proxmox, Ubuntu and Why the Hell Not?

First, I spun up a new Ubuntu LTS VM in my Proxmox environment. I won't bore you with the step-by-step here - if you've ever installed an OS and cursed at a terminal, you know the vibe.


I maxed out CPU allocation (12 cores across dual CPUs) because, well, no GPU. We're flying economy class on compute here but this is about proof of concept, not production speed. Like Frankenstein's monster, I just wanted it alive.


🐙Step 1: Summoning Ollama

A single one-liner fetches and installs Ollama, a tool that lets you run LLMs locally. It installs fast, warns you that there's no GPU, then quietly judges your life choices when the CPU starts groaning. The local API endpoint pops up (127.0.0.1:11434) - write that down. You'll forget. I did. Multiple times.
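For reference, the install command is Ollama's official script, and a quick curl confirms the API is up (port 11434 is the default):

```shell
# Official Ollama install script - fetches and runs the installer
curl -fsSL https://ollama.com/install.sh | sh

# Sanity check: the local API should answer on the default port
curl http://127.0.0.1:11434/   # replies "Ollama is running"
```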


🤖Step 2: Choosing the Forbidden Model

I wanted uncensored, unfiltered, un-neutered answers. So I went with llama2-uncensored. Yeah, it's edgy. Yeah, it crashed my VM. A lot. Turns out these models really don't like CPU-only environments.


Solution? SSH in remotely. Loading the model via SSH prevented the system from logging me out or full-blown dying. It was slow, but it worked. Like strapping a jet engine to a tricycle and watching it lurch forward in slow motion.
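The SSH workflow looked roughly like this (the user and host are placeholders for my lab VM):

```shell
# From another machine, so a desktop session crash doesn't kill the load
ssh user@vm-ip-address          # placeholder credentials for the lab VM

# Pull the model - it's several GB, go get coffee
ollama pull llama2-uncensored

# Load it and ask something that would make ChatGPT clutch its pearls
ollama run llama2-uncensored "What ports does SMB use?"
```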


Did I test if it was truly uncensored? Let's just say I asked it things that would make ChatGPT turn into a motivational speaker. And yup - it passed.


📚 Step 3: Feeding It My Brain

Time to teach this thing. I wanted it to know me - specifically, my red team research, notes and blog posts. For this, I used OpenWebUI, a clean Docker-based front-end that supports document uploads and RAG (Retrieval-Augmented Generation).


Install was easy:

sudo apt install docker.io
sudo docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

I had some config issues getting it to talk to Ollama. The fix? Edit the ollama.service file and bind the service to 0.0.0.0:11434 (i.e., listen on all network interfaces on that port):

sudo nano /etc/systemd/system/ollama.service

[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"

After the modification, reload the services:

sudo systemctl daemon-reload
sudo systemctl restart ollama

Now I could access the AI locally at http://localhost:3000 (or from another networked computer or phone via the VM's IP on the same port).
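To confirm Ollama was actually listening on all interfaces (and that OpenWebUI could reach it), a quick check from another machine on the network does the trick; the IP below is a placeholder for your VM:

```shell
# Replace 192.168.1.50 with your VM's IP (placeholder)
curl http://192.168.1.50:11434/          # should reply "Ollama is running"
curl http://192.168.1.50:11434/api/tags  # lists installed models as JSON
```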


🧠 Step 4: Training Day - RAG-Style

Inside OpenWebUI, I created a new Knowledge Base just for Red Team testing. I dragged and dropped one of my own blog articles (PDF format) into it. Boom - it parsed it, indexed it, and now the AI had access to my own thoughts.

From the Models area, I created a matching Model that pointed to this newfound knowledge. The model uses the existing llama2 LLM but augments it with my super-powers... and also acts like Dade Murphy... who we all love and admire - to this day.

From a new chat, I selected my uncensored model + custom knowledge base, and asked a specific question related to the uploaded article. It responded perfectly - referencing content from the document itself.
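You can also query the model outside OpenWebUI, straight against Ollama's local API. A minimal sketch using only the Python standard library (the model name and prompt assume my setup):

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434/api/generate"  # default Ollama endpoint

def build_payload(model: str, prompt: str) -> dict:
    # stream=False returns a single JSON object instead of a token stream
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    # POST the prompt to the local Ollama API and return the text answer
    data = json.dumps(build_payload(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (with the server running):
# print(ask("llama2-uncensored", "What ports does SMB use?"))
```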



This was legit RAG in action and now, I had a local AI that:

  • Understood my Red Team context

  • Operated without internet access

  • Wasn't shackled by corporate guardrails

  • Could be expanded with more personal intel


⚠️The Trade-Offs: Speed vs. Sanity

Did it work? Yes. Was it fast? Hell no. This thing chewed through CPU like a starving raccoon in a trash can (that reminds me, I need to buy a new green bin... they tore that thing apart). It worked, but barely. If I plan to use this regularly, I'll need an NVIDIA GPU and probably some cooling fans that don't sound like jet turbines during takeoff.


But the proof of concept is solid.


🧠Why This Matters

For cybersecurity professionals (which I am not), especially Red Teamers, having an offline AI model means freedom. It means autonomy. You're not limited by OpenAI's rules or Google's definition of "safety". You're a big boy (or girl): you define the boundaries, and you own the data.


For me, this was more than just a weekend project. It was a demonstration of self-reliance - a middle finger to the idea that AI has to be cloud-based, censored, or corporately neutered. It's a small step toward building an AI assistant tailored to my workflow, my data and my curiosity.


And trust me - once you build your own AI that you FULLY control?


You'll never want to go back.


🔮What's Next?

  • Get a dedicated GPU system to handle model loading faster.

  • Test more advanced uncensored models like llama3 or Mistral with multimodal support

  • Expand the knowledge base with full Red Team playbooks, MITRE techniques, tool documentation and even internal logs.

  • Maybe build an API layer to automate querying during live pen tests.
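That API layer is mostly plumbing. A minimal sketch against Ollama's /api/chat endpoint - the system prompt, function names, and persona here are hypothetical, not anything the tool ships with:

```python
import json
import urllib.request

OLLAMA_CHAT = "http://127.0.0.1:11434/api/chat"  # default Ollama chat endpoint

# Hypothetical persona for lab queries
SYSTEM = "You are my red team lab assistant. Answer tersely."

def build_chat(model: str, question: str, system: str = SYSTEM) -> dict:
    # /api/chat takes a messages list, like most chat-style APIs
    return {
        "model": model,
        "stream": False,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": question},
        ],
    }

def extract_answer(reply: dict) -> str:
    # /api/chat returns {"message": {"role": "assistant", "content": ...}, ...}
    return reply["message"]["content"]

def ask(model: str, question: str) -> str:
    data = json.dumps(build_chat(model, question)).encode()
    req = urllib.request.Request(
        OLLAMA_CHAT, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return extract_answer(json.loads(resp.read()))
```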


Got questions? Want to try it yourself? Or just want to argue about AI ethics while pretending we're in a cyberpunk noir? Hit me up.

 
 
 
