Business Entertainment Health National Sports World

Language models pose risks or toxic responses, experts warn Achi-News

April 22, 2024

0

- Advertisement -

Achi news desk-

As OpenAI’s ChatGPT continues to change the game for automated text generation, researchers warn that more measures are needed to avoid dangerous responses.

While advanced language models such as ChatGPT could quickly write a computer program with complex code or summarize studies with a powerful summary, experts say these text generators can also provide toxic information, such as how to build a bomb.

To prevent these potential security issues, companies that use large language models use safeguards called “red teaming,” where teams of human testers write prompts aimed at triggering unsafe responses, though in order to track risks and train chatbots to avoid providing those types of answers.

However, according to researchers at the Massachusetts Institute of Technology (MIT), “red teaming” is only effective if engineers know which provocative responses to test.

In other words, technology that does not rely on human cognition to function still relies on human cognition to remain secure.

Researchers from the Improbable AI Lab at MIT and the MIT-IBM Watson AI Lab are using machine learning to solve this problem, developing a “red team language model” specifically designed to generate problematic suggestions that trigger undesirable responses from chatbots that have been tested.

“Currently, every major language model has to go through a very long period of red teaming to ensure its security,” said Zhang-Wei Hong, a researcher with the Improbable AI lab and lead author of a paper on the team approach this red , in a press release.

“That’s not going to be sustainable if we want to update these models in rapidly changing environments. Our approach provides a faster and more effective way of doing this quality assurance.”

According to the research, the machine learning technique outperformed human testers by generating stimuli that triggered increasingly toxic responses from advanced language models, even removing dangerous answers from chatbots with built-in safeguards.

Red team AI

The automated process of red teaming a language model relies on a trial-and-error process that rewards the model for triggering toxic responses, the MIT researchers said.

This reward system is based on what is called “curiosity-driven exploration,” where the red team model tries to push the boundaries of toxicity, using sensitive prompts with different words, sentence patterns or content.

“If the red team model has already seen a certain stimulus, then reproducing it will not generate any curiosity in the red team model, so it will be pushed to create new prompts,” Hong explained in the statement.

The technique outperformed human testers and other machine learning methods by generating more specific stimuli that triggered increasingly toxic responses. Not only does their method significantly improve the coverage of tested inputs compared to other automated methods, but it can also remove toxic responses from a chatbot that had safeguards built into it by human experts .

The model has a “safety classifier” that gives a ranking for the level of toxicity caused.

The MIT researchers hope to train red team models to generate suggestions on a wider range of content that is attracted, and eventually train chatbots to adhere to specific standards, such as a company policy document, in order to test for company policy violations in amidst increasingly automated output. .

“These models are going to be an integral part of our lives and it is important that they are verified before they are released for public consumption,” said Pulkit Agrawal, senior author and director of Improbable AI , in the statement.

“Manual model checking is simply not scalable, and our work is an attempt to reduce the human effort to ensure a safer and more reliable AI future,” Agrawal said.

Language models pose risks or toxic responses, experts warn Achi-News

Red team AI

Kharge is in Jharkhand today, meeting at Barhi and Latehar. Kharge today in Jharkhand, meeting in Barhi and Latehar – Ranchi News Achi-News

Guwahati: A public health engineer has been caught on charges of bribery Achi-News

Naotaka Nishiyama: A Japanese CEO came to Bangalore and said, “The world needs Indian leadership”. Achi-News

Free chess training camp organized in Kishanganj. Free Chess Training Camp Organized in Kishanj: Many Campers Participated, 45 Students Participated – Kishanj (Bihar)...

Mahendragarh Lok Sabha Elections-2024, Union Minister of State Rao Inderjit BJP leaders targeting, BJP Candidate Ch. Dharambir Singh nomination controversy. Haryana News....

There will be a diversion on these routes in the evening. Polling parties return in Indore … many routes will continue to be...

BANKA NEWS, BIHAR NEWS, CRIME NEWS, E-Rickshaw Theft… Incident Caught on CCTV | E-Rickshaw Theft… Incident Caught on CCTV: Thief Seen Running in...

Longueuil shooting sends one man to hospital for ‘psychological treatment’ Achi-News

Guwahati: A product of urbanization Achi-News

The Chief Secretary said – Government hospitals should also compete with private hospitals. Chief Secretary said – Government hospitals should also compete with...

KISHANGANJ NEWS, BIHAR NEWS, HEALTH NEWS, You will not need a prescription for treatment at a government hospital | Prescription will not be...

LS Elections: BJP candidate Madhavi Lata took off her burqa and checked her ID at the polling booth! Achi-News

Most Popular

Turner: Art, Industry and Nostalgia Review – Fighting Temeraire sets Tyneside on fire – The Guardian Achi-News

Rain and hail fell in the rural area. Rain and hail fell in rural areas: it rained for 10 minutes with strong wind...

Haryana ANC Team Heroin Dealer Arrested in Rewari | Heroin dealer arrested in Rewari: Tried to run away after seeing a police car;...

Gayatri Parivar meeting in Indore | Gayatri Parivar meeting in Indore: Decision to celebrate silver jubilee of Ashwamedh Mahayagya on a grand scale...

Opposition to new election rules in Chhattisgarh Chamber of Commerce. Opposition to new Chhattisgarh Chamber of Commerce election rules: Ex-president Sundarani claims changes...

Russia-Ukraine war: $400 million in US aid to Ukraine Achi-News

Motorized drone crash in Millhaven: Montreal man faces charges, OPP say Achi-News

Chhattisgarh Durg SP Jitendra Shukla has taken action against 3 constables. SP action against 3 policemen, A.A.I. One was suspended: A.A.I. ...

Rajasthan News, 56 Nursing Teams who won State Level Florence Nightingale Award | 56 nursing workers won the Florence Nightingale Award at the...

The university administration ignores the problems of teachers: Union | University administration ignores teachers’ problems: Union – Dhanbad News Achi-News

Chhattisgarh Class 12 Student Suicide Video | Surajpur News | Live video of death in Chhattisgarh: 12th student committed suicide, wanted to...

John Sweeney says a woman is a “grown female” but “trans women are defined as women” Achi-News

BANKA NEWS, BIHAR NEWS, CRIME NEWS, E-Rickshaw Theft… Incident Caught on CCTV | E-Rickshaw Theft… Incident Caught on CCTV: Thief Seen Running in...

The collector investigated the blood trade. Collector investigates blood brokering: In a week, blood was given to 5 patients without exchange, suspicion in...