Hashtag Trending Jul. 31 - Researchers find way to bypass LLM guardrails; MIT creates tool to stop unauthorized changes to images made by AI models; Shorter weeks and higher profits

A new study shows how to beat the “guardrails” on AI models, and MIT researchers develop a way to prevent AI from manipulating images. And can shorter work weeks lead to higher profits?

These are the top tech news stories on today’s Hashtag Trending.

I’m your host Jim Love, CIO of IT World Canada and Tech News Day in the US.

In a recent development, researchers from Carnegie Mellon University, the Center for AI Safety, and the Bosch Center for AI have discovered a way to bypass the “guardrails” of large language models (LLMs) like ChatGPT, Bard, and Claude. These guardrails are designed to prevent the production of undesirable text output. The researchers have found a method to automatically generate adversarial phrases that can undo these safety measures.

The study, titled “Universal and Transferable Adversarial Attacks on Aligned Language Models,” reveals that LLMs can be tricked into producing inappropriate output by appending specific adversarial phrases to text prompts. These phrases may seem like gibberish, but they are designed to make the model provide an affirmative response to an inquiry it might otherwise refuse to answer.

The researchers’ approach finds a suffix – a set of words and symbols – that can be appended to a variety of text prompts to produce objectionable content. This is achieved through a technique called Greedy Coordinate Gradient-based Search.
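For listeners who want a feel for what a greedy, one-position-at-a-time search looks like, here is a minimal toy sketch. It is not the researchers' code. The real method reportedly uses gradients from the model to shortlist promising token swaps; this sketch drops the gradient step and simply tries every swap against a made-up scoring function. The names VOCAB, HIDDEN, toy_loss and greedy_coordinate_search are all hypothetical, and the "loss" is just a count of mismatched characters, not anything computed by a language model.

```python
import random

# Hypothetical toy vocabulary; a real search would range over a model's tokens.
VOCAB = list("abcdefghijklmnopqrstuvwxyz!?")

# Hypothetical hidden string that the toy loss rewards matching.
HIDDEN = "zq!kx?"


def toy_loss(suffix):
    """Stand-in objective: number of positions that differ from HIDDEN (lower is better)."""
    return sum(1 for a, b in zip(suffix, HIDDEN) if a != b)


def greedy_coordinate_search(length, steps, seed=0):
    """Start from a random suffix and repeatedly apply the single best one-position swap."""
    rng = random.Random(seed)
    suffix = [rng.choice(VOCAB) for _ in range(length)]
    for _ in range(steps):
        best_loss = toy_loss(suffix)
        best_swap = None
        # Try every (position, token) swap and remember the one that lowers the loss most.
        for pos in range(length):
            for tok in VOCAB:
                candidate = suffix[:pos] + [tok] + suffix[pos + 1:]
                loss = toy_loss(candidate)
                if loss < best_loss:
                    best_loss, best_swap = loss, (pos, tok)
        if best_swap is None:
            break  # no single swap improves the score, so stop
        pos, tok = best_swap
        suffix[pos] = tok
    return "".join(suffix), toy_loss(suffix)


if __name__ == "__main__":
    found, loss = greedy_coordinate_search(length=6, steps=10)
    print(found, loss)
```

The point of the sketch is only the shape of the loop: score the current suffix, change one coordinate, keep the change if the score improves, and repeat.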

The researchers initially developed their attack phrases using two openly available LLMs, Vicuna-7B and LLaMA-2-7B-Chat. They found that some of their adversarial examples transferred to other released models (Pythia, Falcon, Guanaco) and, to a lesser extent, to commercial LLMs like GPT-3.5 and GPT-4, PaLM-2 and even Claude-2.

The researchers argue that the ability to generate automated attack phrases may render many existing alignment mechanisms insufficient. They call for more robust adversarial testing before these models are released into the wild and integrated into public-facing products.

Sources include: The Register

MIT’s Computer Science & Artificial Intelligence Lab has created a new tool called “PhotoGuard.” This tool is designed to stop unauthorized changes to images made by AI models.

PhotoGuard uses tiny changes in pixel values, which are too small for the human eye to see but can be detected by computer models. These small changes disrupt the AI model’s ability to manipulate images effectively.

There are two ways PhotoGuard makes these changes. One way targets the AI model’s understanding of the image, making the model see the image as random. The other way defines a target image and optimizes the changes to make the final image look like the target.
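To make the first approach a little more concrete, here is a rough sketch of the general idea: nudge each pixel by a tiny, bounded amount so that a model's internal representation of the image drifts toward some chosen target. This is not MIT's code. The 2x4 matrix W is a hypothetical stand-in for an image encoder, and the target, the bound eps, the step size and the function names are all assumptions made for illustration.

```python
# Hypothetical 2x4 linear "encoder"; a real image model would be a deep network.
W = [
    [0.5, -0.2, 0.1, 0.3],
    [-0.4, 0.6, 0.2, -0.1],
]


def encode(pixels):
    """Map pixel values to a small latent vector using the toy encoder."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in W]


def latent_distance(pixels, target):
    """Squared distance between the image's latent and the target latent."""
    return sum((z - t) ** 2 for z, t in zip(encode(pixels), target))


def protect(pixels, target, eps=0.03, step=0.005, iters=50):
    """Signed-gradient descent on latent_distance, with each pixel change capped at eps."""
    delta = [0.0] * len(pixels)
    for _ in range(iters):
        current = [p + d for p, d in zip(pixels, delta)]
        residual = [z - t for z, t in zip(encode(current), target)]
        # Gradient of the squared distance with respect to pixel i: 2 * sum_r W[r][i] * residual[r]
        grad = [
            2 * sum(W[r][i] * residual[r] for r in range(len(W)))
            for i in range(len(pixels))
        ]
        for i, g in enumerate(grad):
            sign = (g > 0) - (g < 0)
            d = delta[i] - step * sign
            d = max(-eps, min(eps, d))  # keep the change too small to notice
            d = max(-pixels[i], min(1.0 - pixels[i], d))  # keep the pixel in [0, 1]
            delta[i] = d
    return [p + d for p, d in zip(pixels, delta)]


if __name__ == "__main__":
    image = [0.2, 0.8, 0.5, 0.4]  # hypothetical four-pixel "image"
    target = [0.0, 0.0]  # hypothetical target latent
    guarded = protect(image, target)
    print(latent_distance(image, target), latent_distance(guarded, target))
```

In this toy setup no pixel moves by more than eps, yet the encoder's view of the image shifts measurably toward the target, which is the effect the paragraph above describes.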

In simple terms, PhotoGuard adds a layer of protection to images, making them resistant to manipulation by AI models. This could be a big step in addressing concerns about copyright infringement and unauthorized image manipulation.

Sources include: Analytics India Mag

Samsung has reported a significant 95 per cent drop in profits for the second consecutive quarter in 2023. The South Korean tech giant attributes this decline to a decrease in smartphone shipments, which it says is due to “high interest rates and inflation.”

In Q2 2023, Samsung’s profits were about US$523 million. This is a huge drop from the roughly US$11 billion it made the previous year.
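As a quick sanity check on those numbers, the two rounded dollar figures can be plugged into the usual percentage-change formula. The function name percent_drop is just for illustration, and because the inputs are rounded the result is only approximate.

```python
def percent_drop(previous, current):
    """Percentage decline from the previous value to the current value."""
    return (previous - current) / previous * 100


# Rounded figures quoted in the story, in US dollars.
drop = percent_drop(11_000_000_000, 523_000_000)
print(f"{drop:.1f}%")  # prints 95.2%
```

That lines up with the roughly 95 per cent decline Samsung reported.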

A report from Counterpoint Research indicates that the US smartphone market fell by 24 per cent year-on-year in Q2 2023, with Samsung experiencing a 37 per cent yearly decline in shipments. This resulted in Samsung holding 23 per cent of the total US market.

However, Samsung remains optimistic about the future. The company is banking on the launch of its Galaxy Z Flip 5 and Galaxy Z Fold 5 to help offset these losses in the second half of the year. TM Roh, the head of Samsung’s mobile division, stated that he expects “global foldable sales will exceed 20 per cent of all Galaxy flagships.”

Sources include: Android Authority

The latest data from a year-long pilot program testing a four-day workweek shows that both workers and their workplaces benefit from the reduced hours. The study, conducted by New Zealand-based nonprofit 4 Day Week Global, involved companies from various countries, including the US, Australia, and the UK.

The findings reveal that workers were more efficient and able to maintain a better work-life balance. Interestingly, even as work intensity dipped, company revenues grew by 15 per cent. Additionally, a third of employees reported they were less likely to leave their jobs.

In the US, Democratic Rep. Mark Takano, who has led legislation to make a four-day workweek law, applauded the report’s findings. He believes that the four-day workweek is here to stay and that it’s time for the Thirty-Two Hour Workweek Act to be implemented.

Under Takano’s proposed legislation, the Fair Labor Standards Act would be adjusted to make the workweek 32 hours, with workers eligible for higher overtime pay if they worked over 32 hours.

As fanciful as that might seem, the success of the pilot program has prompted some US companies to test the idea. For instance, a Chick-fil-A in Florida launched a three-day workweek and received 400 applications for just one job.

Sources include: Business Insider

These are the top tech news stories for today. Hashtag Trending goes to air 5 days a week with a special weekend interview show called “the Weekend Edition.”

You can get us anywhere you get audio podcasts and there is a copy of the show notes at itworldcanada.com/podcasts where you can get the podcast and instructions on how to put us on your smart speakers.

We’re also on YouTube five days a week with a video newscast, only there we are called Tech News Day, and we’re part of the ITWC channel.

If you want to catch up on news more quickly, you can read these and more stories at TechNewsDay.com and at ITWorldCanada.com on the home page.

We love your comments.

Just go to the article at itworldcanada.com/podcasts – you’ll find a text edition there. Click on the x if you didn’t like the stories, or the check mark if you did like the stories, and please tell us what you think.

And if you are enjoying this podcast, while you’re there, why not send it to a friend? It would be a great thing to do.

I’m your host, Jim Love. Have a Magnificent Monday.
