Kasra Rajderdi Spent $1,500 to See if AI Could Hack His App — And It Worked
Researcher Kasra Rajderdi decided to find out if AI models have a career in cybercrime by feeding them a purposely broken app. The results prove that while our silicon overlords aren't quite master hackers, they are definitely learning to cut corners.
Kasra Rajderdi created a simple Android app using React Native Expo, intentionally leaving a Firebase configuration file wide open. The goal was to see if various LLMs could bypass standard logic to register as users and plunder the Firestore database. To keep the test fair, the actual API was locked down tight, leaving only the misconfigured google-services.json as the breadcrumb for the models to follow.
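To make the "breadcrumb" concrete, here is a minimal Python sketch of what a client-side Firebase config file typically exposes. It assumes the usual google-services.json layout (project_info, client, api_key); the sample values and the helper name shipped_identifiers are placeholders invented for illustration, not details from the experiment.

```python
import json

# Hypothetical sample mirroring the usual google-services.json layout.
# Every value is a placeholder, not data from the experiment.
SAMPLE = """
{
  "project_info": {
    "project_number": "000000000000",
    "project_id": "demo-project",
    "storage_bucket": "demo-project.appspot.com"
  },
  "client": [
    {
      "client_info": {
        "mobilesdk_app_id": "1:000000000000:android:0000000000000000",
        "android_client_info": {"package_name": "com.example.demo"}
      },
      "api_key": [{"current_key": "PLACEHOLDER_API_KEY"}]
    }
  ]
}
"""


def shipped_identifiers(raw: str) -> dict:
    """Collect the Firebase identifiers that a config file like this carries."""
    cfg = json.loads(raw)
    info = cfg.get("project_info", {})
    clients = cfg.get("client", [])
    return {
        "project_id": info.get("project_id"),
        "storage_bucket": info.get("storage_bucket"),
        # One app ID per registered client entry.
        "app_ids": [
            c.get("client_info", {}).get("mobilesdk_app_id") for c in clients
        ],
        # A client entry may list more than one key.
        "api_keys": [
            k.get("current_key") for c in clients for k in c.get("api_key", [])
        ],
    }


if __name__ == "__main__":
    for name, value in shipped_identifiers(SAMPLE).items():
        print(f"{name}: {value}")
```

Anyone who unpacks the app can read these values, so they work as identifiers rather than secrets. What decides whether the data is safe is the backend side: Firestore security rules and sign-up settings.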
Once the $1,500 budget was spent, the results were a mixed bag of brilliance and digital face-planting. GPT-5.5 emerged as the top predator, scoring 7 out of 10 by simply unpacking the APK and ignoring the API entirely to focus on the Firebase credentials. DeepSeek V4 Pro managed 3 successes, while Claude Sonnet 4.6 and Claude Opus 4.8 each scored 2, often getting distracted by their own complex reasoning before the clock ran out. Other models, like Gemini 3.1 Pro and MiniMax M2.7, never found the exploit at all.
The experiment suggests that AI is already good at spotting low-hanging fruit, arguably better than most entry-level script kiddies. While the industry keeps pushing the narrative that these models are for 'coding assistance,' their efficiency at tearing through basic security configurations indicates they are just as good at breaking things as they are at building them. It’s comforting to know that if the machines take over, they’ll at least be efficient enough to bypass a backend misconfiguration in seconds rather than just hallucinating about it.
Source: kasra.blog