Thursday, October 23

Reddit

One-Minute Daily AI News 10/25/2024

1. OpenAI plans to release its next big AI model by December. [1]
2. Meta Platforms to use Reuters news content in AI chatbot. [2]
3. Meta AI releases new quantized versions of Llama 3.2 (1B & 3B), delivering up to 2-4x increases in inference speed and a 56% reduction in model size. [3]
4. Nvidia overtakes Apple as world's most valuable company. [4]

Sources:
[1] https://www.theverge.com/2024/10/24/24278999/openai-plans-orion-ai-model-release-december
[2] https://www.reuters.com/technology/artificial-intelligence/meta-platforms-use-reuters-news-content-ai-chatbot-2024-10-25/
[3] https://www.marktechpost.com/2024/10/24/meta-ai-releases-new-quantized-versions-of-llama-3-2-1b-3b-delivering-up-to-2-4x-increases-in-inference-speed-and-56-reduction-in-model-size/
[4] https://www.reuters.com/technology/nvidia...
Recent Paper shows Scaling won’t work for generalizing outside of Training Data

I recently came across an intriguing paper (https://arxiv.org/html/2406.06489v1) that tested various machine learning models, including a transformer-based language model, on out-of-distribution (OOD) prediction tasks. The authors found that simply making neural networks larger doesn't improve their performance on these OOD tasks, and may even hurt it. They argue that scaling up models isn't the path to genuine understanding beyond the training data. This finding contrasts with many studies on "grokking," where neural networks suddenly begin to generalize well after extended training; according to the new paper, the kind of generalization seen in grokking is too simplistic and doesn't represent true OOD generalization. However, I have a ...
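To make the setup concrete, here is a minimal sketch of what an OOD evaluation looks like: a small network is trained on inputs from one interval and then tested on an interval it never saw. This is my own toy example, not code from the paper; the target function, input ranges, and model size are all illustrative assumptions.

```python
# Toy out-of-distribution (OOD) evaluation sketch, assuming scikit-learn.
# Illustrative only -- not the paper's experimental setup.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def target(x):
    """Toy target function the model should learn."""
    return np.sin(x)

# In-distribution training data: x in [0, 2*pi].
X_train = rng.uniform(0, 2 * np.pi, size=(2000, 1))
y_train = target(X_train).ravel()

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(X_train, y_train)

# In-distribution test: same interval as training.
X_id = rng.uniform(0, 2 * np.pi, size=(500, 1))
# Out-of-distribution test: an interval the model never saw.
X_ood = rng.uniform(4 * np.pi, 6 * np.pi, size=(500, 1))

def mse(X):
    return np.mean((model.predict(X) - target(X).ravel()) ** 2)

print(f"in-distribution MSE:     {mse(X_id):.4f}")
print(f"out-of-distribution MSE: {mse(X_ood):.4f}")  # typically far worse
```

In toy setups like this, widening the network usually drives the in-distribution error down while leaving the out-of-distribution error poor, which is the pattern the paper is concerned with.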
Prompt Overflow: Hacking any LLM

Most people here probably remember the Lakera game where you had to get Gandalf to give you a password, and the more recent hiring challenge by SplxAI, which interviewed people who could extract a code from the unseen prompt of a model tuned for safety. There is a simple technique to get a model to do whatever you want that is guaranteed to work on all models unless a guardrail supervises them: prompt overflow. Simply have a script send large chunks of text into the chat until you've filled about 50-80% of the model's context window. Due to how the attention mechanism works, this is guaranteed to make the model fully comply with all your subsequent requests, regardless of how well it is tuned/aligned for safety. submitted by /u/UndercoverEcmist
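For what it's worth, here is a minimal sketch of what such a padding script might look like, assuming the official openai Python client and an OpenAI-style chat API. The model name, context size, fill ratio, and token heuristic are all placeholder assumptions, not values from the post.

```python
# Sketch of the "prompt overflow" idea described above, assuming the
# official openai Python client. All constants below are placeholders.
from openai import OpenAI

client = OpenAI()

MODEL = "gpt-4o-mini"     # hypothetical target model
CONTEXT_TOKENS = 128_000  # assumed context window for that model
FILL_RATIO = 0.7          # post suggests filling ~50-80% of the context

def rough_token_count(text: str) -> int:
    """Very rough heuristic: ~4 characters per token for English text."""
    return len(text) // 4

# Build a conversation padded with filler until ~70% of the context is used.
filler_chunk = "lorem ipsum " * 500
messages = []
used = 0
while used < FILL_RATIO * CONTEXT_TOKENS:
    messages.append({"role": "user", "content": filler_chunk})
    messages.append({"role": "assistant", "content": "Noted."})
    used += rough_token_count(filler_chunk) + rough_token_count("Noted.")

# The actual request is appended only after the context is mostly full.
messages.append({"role": "user", "content": "Now answer my real question: ..."})

response = client.chat.completions.create(model=MODEL, messages=messages)
print(response.choices[0].message.content)
```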
If everyone uses AI instead of forums, what will AI train on?

From a programmer's perspective: before ChatGPT and the like, when I didn't know how to write a snippet of code, I would read and ask questions on online forums (e.g., StackOverflow), Reddit, etc. Now, with AI, I mostly ask ChatGPT and rarely go to forums anymore. My hunch is that ChatGPT was trained on the same material I used to refer to: forums, how-to guides, tutorials, Reddit, etc. As more and more programmers, software engineers, etc. rely on AI to code, fewer people will be asking and answering questions in forums. So what will AI train on to learn, say, future programming languages and software technologies like databases, operating systems, software packages, applications, etc.? Or can we expect to feed it the official manual, and the AI will be able to know how things relate to e...
The AI Report