The Transformative Power of Artificial Intelligence in Healthcare

Artificial Intelligence (AI) has emerged as a disruptive force across various industries, and its potential impact on healthcare is nothing short of revolutionary. With advancements in machine learning and data analytics, AI has the ability to transform healthcare delivery, improve patient outcomes, and enhance overall efficiency. This article explores the key areas where AI is making a significant impact in healthcare and discusses the benefits and challenges associated with its implementation.

Precision Medicine: AI is revolutionizing the field of precision medicine by analyzing vast amounts of patient data to identify patterns, predict disease progression, and personalize treatment plans. Machine learning algorithms can sift through genetic information, medical records, and real-time patient data to identify specific biomarkers and genetic mutations associated with diseases. This enables healthcare providers to deliver targeted therapies, improve diagnosis accuracy, and optimize treatment strategies for individual patients.

Medical Imaging and Diagnostics: AI-powered imaging technologies are revolutionizing medical diagnostics. Deep learning algorithms can analyze medical images such as X-rays, CT scans, and MRIs with incredible accuracy, assisting radiologists in detecting abnormalities and making faster, more accurate diagnoses. AI can also help prioritize urgent cases, reducing waiting times and improving patient outcomes. Furthermore, machine learning algorithms can continuously learn from new data, leading to ongoing improvements in diagnostic accuracy over time.
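
As a simplified, purely illustrative sketch (not a clinical tool), the kind of deep learning workflow involved might look like fine-tuning a pretrained CNN on labeled X-ray images; the folder layout and "normal vs. abnormal" labels below are hypothetical assumptions for illustration only.

```python
# A minimal sketch: fine-tune an ImageNet-pretrained CNN to flag abnormalities
# in chest X-rays. Dataset path and label set are hypothetical.
import torch
import torch.nn as nn
from torchvision import models, transforms
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

# Reuse a pretrained backbone and replace the classifier head with a
# binary "normal vs. abnormal" output.
model = models.densenet121(weights=models.DenseNet121_Weights.DEFAULT)
model.classifier = nn.Linear(model.classifier.in_features, 2)

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Hypothetical folder layout: xrays/normal/*.png, xrays/abnormal/*.png
loader = DataLoader(ImageFolder("xrays", transform=preprocess),
                    batch_size=16, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:  # one pass, for illustration
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```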

Predictive Analytics: AI algorithms can leverage large datasets to identify patterns and predict outcomes, enabling early detection and intervention. By analyzing electronic health records and wearable device data, AI can detect subtle changes in a patient’s health status, allowing healthcare providers to intervene before a condition worsens. This proactive approach can lead to better management of chronic diseases, reduction in hospital readmissions, and improved patient well-being.
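
As a simplified illustration of this kind of predictive model (with a hypothetical tabular extract of EHR and wearable features), a readmission-risk classifier might be sketched like this:

```python
# A minimal sketch, assuming a hypothetical CSV of patient features:
# predict 30-day readmission risk with a gradient-boosted classifier.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Hypothetical columns: age, avg_resting_hr, hba1c, prior_admissions, readmitted_30d
df = pd.read_csv("patient_features.csv")
X = df.drop(columns=["readmitted_30d"])
y = df["readmitted_30d"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = GradientBoostingClassifier().fit(X_train, y_train)
risk = clf.predict_proba(X_test)[:, 1]  # probability of readmission
print("AUC:", roc_auc_score(y_test, risk))

# Patients above a chosen threshold could be flagged for early follow-up.
flagged = X_test[risk > 0.7]
```

Patients flagged by such a model could then be prioritized for follow-up before their condition worsens, which is the proactive intervention described above.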

Virtual Assistants and Chatbots: AI-powered virtual assistants and chatbots are transforming the way patients interact with healthcare systems. These intelligent systems can provide personalized medical advice, answer frequently asked questions, and assist with appointment scheduling. By automating routine tasks, virtual assistants free up healthcare professionals’ time, allowing them to focus on more complex and critical patient care tasks.

Drug Discovery and Development: AI is streamlining the drug discovery process by rapidly analyzing massive amounts of data and identifying potential drug candidates. Machine learning algorithms can analyze existing drug databases, predict drug-target interactions, and identify novel therapeutic targets. This accelerates the drug discovery timeline, reduces costs, and increases the chances of identifying effective treatments for complex diseases.

Challenges and Ethical Considerations:

While the potential benefits of AI in healthcare are immense, it is crucial to address the associated challenges and ethical considerations. Data privacy, security, and the responsible use of patient data are paramount. Additionally, ensuring transparency and explainability of AI algorithms is essential to gain trust from patients and healthcare professionals. Striking a balance between human expertise and AI assistance is also crucial to maintain the human touch and ensure the ethical deployment of AI in healthcare.

Conclusion:

Artificial Intelligence is revolutionizing healthcare by augmenting human capabilities and improving patient outcomes. From precision medicine to medical imaging, predictive analytics to virtual assistants, AI is reshaping the entire healthcare landscape. Embracing AI in healthcare requires collaboration between healthcare professionals, technology experts, and policymakers to overcome challenges and ensure responsible implementation. By harnessing the power of AI, we have the potential to transform healthcare and create a future where personalized, efficient, and effective care is accessible to all.


UberEats to use 2000 AI powered robots for delivery by 2026

Many technologists around the world argue that AI technology might spell doom for mankind in the near future. Amid such “risk of extinction” concerns, UberEats has officially announced that it plans to use over 2,000 AI-powered four-wheeled robots for delivery by 2025-26.

The delivery service will be available to customers via the app and will initially be restricted to about 16 cities. It will later expand and be offered in other parts of the United States, if all goes well.

A robotics firm named “Serve” will be working with the delivery giant and is ready to offer over 2,000 self-driving bots capable of carrying over 50 pounds of merchandise. These AI-powered machines can operate for 25 miles on a single charge, allowing them to deliver dozens of orders within a 5-8 hour time span at a speed of 3 miles per hour, even under adverse weather conditions such as heavy rain and snow.

Uber has made it clear that the service will be offered only to customers for whom it is feasible, and that it will be a contactless delivery experience. Customers will be able to open the box only with a passcode sent to their app at the time of delivery. The Serve-built robot will then leave the customer’s premises after taking a picture of the customer with the order through a camera integrated into the touch-screen panel.

Uber plans to increase the number of Serve Robots once the service gains the much-needed traction in the parcel delivery business. The company is also planning to introduce small-sized robots for the delivery of minor parcels.

The robot’s navigation will rely on Cartken’s artificial intelligence-based mapping technology, which can identify objects, vehicles, humans, and the geography of the location.


The popular Yuzu emulator for Nintendo Switch is now on Android

  • The popular Yuzu Nintendo Switch emulator has now arrived on Android.
  • The emulator supports plenty of games, but requires relatively high-end Snapdragon devices.

Skyline was the most popular Nintendo Switch emulator for Android, but its developers abandoned the project due to apparent legal trouble. Two of the developers decided to work on a follow-up emulator, but another Switch emulator has already arrived on Android to fill the void.

The popular Yuzu emulator on PC has now landed on Android, the developers announced in a blog post (h/t: r/Android). The app can be downloaded via the Play Store in both free and paid early access versions.

It’s time to bid farewell to your first-generation Google Chromecast

  • Google is ending software support for the first Chromecast.
  • The device launched in 2013 and is now over a decade old.
  • The last update to the first generation Chromecast came in November 2022.

Google has ended support for the first generation Chromecast that launched back in 2013. That was a decade ago, so it seems about right that Google is pulling the device out of its update priority list.

Two more official Pixel Tablet accessories might still be in the pipeline

  • Google is reportedly working on an official stylus and keyboard for the Pixel Tablet.
  • It’s unclear whether Google will announce them before the Pixel Tablet becomes widely available in June.

Google announced the Pixel Tablet at I/O 2023 after a whole year of teasing the slate. New information from noted tipster Kamila Wojciechowska suggests the long wait wasn’t intentional but rather the result of a development delay on Google’s part. According to Wojciechowska, we might still not have the complete picture regarding the Pixel Tablet’s official accessories. Google is apparently still working on an official stylus and a keyboard to accompany the device.

Up your game with a record 35% discount on the 8Bitdo Ultimate controller

Third-party gaming accessories have come a long way in recent years. The 8Bitdo Ultimate wired controller is officially licensed by Xbox and makes for an affordable option if you’re looking to replace your controller or expand your multiplayer options. That’s especially true today, as the 8Bitdo Ultimate deal that just dropped on Amazon slashes the asking price to a record low of just $29.25 ($16 off).

For context, the versatile controller has usually sold for $45 since its release last year and has rarely dipped below the $30 mark. This sale only applies to the Ultimate White colorway of the accessory, which is the sleekest-looking one anyway.

Reliable source claims iPhone 16 Pro series could be Apple’s largest

  • Display analyst Ross Young notes that the iPhone 16 Pro and 16 Pro Max could be getting larger screens.
  • The phones will purportedly get 6.2x-inch and 6.8x-inch displays, respectively.
  • Bloomberg’s Mark Gurman has corroborated these claims.

Update: May 29, 2023 (12:54 AM ET): Bloomberg’s Mark Gurman has also corroborated the below-mentioned leak about the display sizes of the iPhone 16 Pro models. According to his information, Apple is planning to increase the size of its two “Pro” iPhone models by “a couple of tenths of an inch diagonally.” The new displays are expected to be the largest ever for the iPhone.


Original article: May 9, 2023 (3:06 AM ET): We’re a few months away from the launch of the iPhone 15 series, but that isn’t stopping leaks about its successors. After sticking to a 19.5:9 aspect ratio for a few generations, Apple looks to be moving towards a longer aspect ratio for 2024’s iPhone 16 Pro series.

Apple iPhone 15: Everything we know so far

Update, May 26, 2023 (4:32 PM ET): This hub has been updated to include a leak that suggests the entire iPhone 15 series could have a frosted glass back panel.


Original article: We’ve become so accustomed to Apple’s iterative smartphone launches each year that the iPhone 14 event represented one of the most exciting in recent memory. Sure, the base model was practically identical to its predecessor, but Apple gave us a Plus instead of a Mini, and the Pro devices introduced a brand-new punch hole display.

The instantly memeable Dynamic Island (Apple’s fancy name for the screen cutout and its corresponding software enhancements) is the biggest shakeup to the iPhone design in years. In many ways, it made up for a lack of innovation elsewhere in Apple’s flagship phone lineup, with plenty of consumers deciding there was no need to upgrade just yet. Will 2023’s iPhone 15 series offer more than just superficial improvements and convince more people to make the jump? Let’s find out.

The Android 14 beta is the buggiest beta I’ve ever installed on my Pixels


Opinion post by Rita El Khoury

Ever since I flashed Android 4.0 Ice Cream Sandwich on my HTC Desire Z, I’ve been curious about trying the latest and — supposedly — greatest version of Android as soon as possible. When Google started publicly testing official Android developer previews and betas, I quickly signed up, and it’s become a yearly tradition for me. New developer preview? On my secondary phone! Out of developer preview and into beta? On my main phone! That’s how I ended up with the Android 14 beta on my Pixel 7 Pro, and all the bugs have made me regret it ever since.

Before I carry on, I know we’re talking about a beta, and beta software is bug-ridden by definition — testing is crucial to get to the stable release. I also know that bugs are hit-and-miss. I don’t recall seeing any notable issues with the previous betas I’ve flashed on my Pixel phones over at least the last five years. That’s extremely lucky, but betas are also supposed to be significantly more stable than early developer previews. I guess all of my karma ran out, though, and I got saddled with a multitude of annoyances and bugs from day one with the Android 14 beta.

5 Android apps you shouldn’t miss this week – Android Apps Weekly

Welcome to the 486th edition of Android Apps Weekly. Here are the big headlines from the last week:

  • There is an app in the works that generates music based on text prompts. You input what kind of music you want to hear, and the AI generates the music. It’s in the experimental stage right now, and it’s not yet available on Google Play. However, you can sign up for the app and play around with it if you want to.
  • Netflix cracked down on password sharing last month, and Amazon is taking some digs about it. They tweeted a picture of the profile selection screen, where the profile names say, “Everyone Who Has Our Password.” It received some applause from Twitter, and it’s a pretty good dig. That said, we also remember when Samsung made fun of Apple for removing the headphone jack before they also removed the headphone jack. Here’s hoping Amazon Prime Video doesn’t follow suit anytime soon.
  • Windows 11 is getting support for RAR files in the near future. This doesn’t pertain to Android apps so much, but with native support coming, folks running Windows 11 won’t have to buy WinRAR anymore. Rejoice.
  • WhatsApp may introduce usernames in the future. The new beta version of the app has code that suggests that such a feature is on its way. A username would allow people to hide their contact information and add people based on their usernames instead. It’d be a nice security feature, for sure.
  • Google’s Magic Compose is now in the wild for anyone to use. We did a hands-on with the feature to show people how it works. Basically, it works like Smart Reply, but with way more options written in a more lifelike way. You can head to the links to see examples and learn more about how it works.

Legendary Master Idle

Price: Free to play

Foundation models for reasoning on charts

Visual language is the form of communication that relies on pictorial symbols outside of text to convey information. It is ubiquitous in our digital life in the form of iconography, infographics, tables, plots, and charts, extending to the real world in street signs, comic books, food labels, etc. For that reason, having computers better understand this type of media can help with scientific communication and discovery, accessibility, and data transparency.

While computer vision models have made tremendous progress using learning-based solutions since the advent of ImageNet, the focus has been on natural images, where all sorts of tasks, such as classification, visual question answering (VQA), captioning, detection and segmentation, have been defined, studied and in some cases advanced to reach human performance. However, visual language has not garnered a similar level of attention, possibly because of the lack of large-scale training sets in this space. But over the last few years, new academic datasets have been created with the goal of evaluating question answering systems on visual language images, like PlotQA, InfographicsVQA, and ChartQA.

Example from ChartQA. Answering the question requires reading the information and computing the sum and the difference.

Existing models built for these tasks relied on integrating optical character recognition (OCR) information and its coordinates into larger pipelines, but the process is error prone, slow, and generalizes poorly. These methods prevailed because existing end-to-end computer vision models based on convolutional neural networks (CNNs) or transformers pre-trained on natural images could not be easily adapted to visual language. Yet end-to-end models are also ill-prepared for the challenges of answering questions on charts, including reading the relative heights of bars or the angles of slices in pie charts, understanding axis scales, correctly mapping pictograms to their legend values with colors, sizes, and textures, and finally performing numerical operations with the extracted numbers.

In light of these challenges, we propose “MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering”. MatCha, which stands for math and charts, is a pixels-to-text foundation model (a pre-trained model with built-in inductive biases that can be fine-tuned for multiple applications) trained on two complementary tasks: (a) chart de-rendering and (b) math reasoning. In chart de-rendering, given a plot or chart, the image-to-text model is required to generate its underlying data table or the code used to render it. For math reasoning pre-training, we pick textual numerical reasoning datasets and render the input into images, which the image-to-text model needs to decode for answers. We also propose “DePlot: One-shot visual language reasoning by plot-to-table translation”, a model built on top of MatCha for one-shot reasoning on charts via translation to tables. With these methods we surpass the previous state of the art in ChartQA by more than 20% and match the best summarization systems that have 1000 times more parameters. Both papers will be presented at ACL2023.

Chart de-rendering

Plots and charts are usually generated by an underlying data table and a piece of code. The code defines the overall layout of the figure (e.g., type, direction, color/shape scheme) and the underlying data table establishes the actual numbers and their groupings. Both the data and code are sent to a compiler/rendering engine to create the final image. To understand a chart, one needs to discover the visual patterns in the image and effectively parse and group them to extract the key information. Reversing the plot rendering process demands all such capabilities and can thus serve as an ideal pre-training task.

A chart created from a table in the Airbus A380 Wikipedia page using random plotting options. The pre-training task for MatCha consists of recovering the source table or the source code from the image.

In practice, it is challenging to simultaneously obtain charts, their underlying data tables, and their rendering code. To collect sufficient pre-training data, we independently accumulate [chart, code] and [chart, table] pairs. For [chart, code], we crawl all GitHub IPython notebooks with appropriate licenses and extract blocks with figures. A figure and the code block right before it are saved as a [chart, code] pair. For [chart, table] pairs, we explored two sources. For the first source, synthetic data, we manually write code to convert web-crawled Wikipedia tables from the TaPas codebase to charts. We sampled from and combined several plotting options depending on the column types. In addition, we also add [chart, table] pairs generated in PlotQA to diversify the pre-training corpus. The second source is web-crawled [chart, table] pairs. We directly use the [chart, table] pairs crawled in the ChartQA training set, containing around 20k pairs in total from four websites: Statista, Pew, Our World in Data, and OECD.
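
As a rough sketch of the synthetic-data side (not the actual TaPas/PlotQA pipeline), one can render a small table into a chart with randomized plotting options and keep the serialized table as the target text. The table and options below are purely illustrative.

```python
# A minimal sketch of generating a synthetic [chart, table] pre-training pair:
# render a small table as a chart with randomized plotting options.
import random
import matplotlib.pyplot as plt

table = {"Year": ["2019", "2020", "2021"], "Deliveries": [8, 5, 12]}

def render_pair(table, out_png="chart.png"):
    fig, ax = plt.subplots(figsize=(4, 3))
    color = random.choice(["tab:blue", "tab:orange", "tab:green"])
    kind = random.choice(["bar", "line"])
    x, y = table["Year"], table["Deliveries"]
    if kind == "bar":
        ax.bar(x, y, color=color)
    else:
        ax.plot(x, y, color=color, marker="o")
    ax.set_xlabel("Year")
    ax.set_ylabel("Deliveries")
    fig.savefig(out_png, dpi=100)
    plt.close(fig)
    # Serialize the table as the text target the image-to-text model must produce.
    target = "Year | Deliveries\n" + "\n".join(f"{a} | {b}" for a, b in zip(x, y))
    return out_png, target

image_path, target_text = render_pair(table)
```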

Math reasoning

We incorporate numerical reasoning knowledge into MatCha by learning math reasoning skills from textual math datasets. We use two existing textual math reasoning datasets, MATH and DROP for pre-training. MATH is synthetically created, containing two million training examples per module (type) of questions. DROP is a reading-comprehension–style QA dataset where the input is a paragraph context and a question.

To solve questions in DROP, the model needs to read the paragraph, extract relevant numbers and perform numerical computation. We found both datasets to be complementary. MATH contains a large number of questions across different categories, which helps us identify math operations needed to explicitly inject into the model. DROP’s reading-comprehension format resembles the typical QA format wherein models simultaneously perform information extraction and reasoning. In practice, we render inputs of both datasets into images. The model is trained to decode the answer.
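
A minimal sketch of that rendering step, using an illustrative DROP-style question and Pillow's default font (the real pipeline's rendering details may differ):

```python
# Render a textual math question into an image; the image becomes the model
# input and the numeric answer becomes the decoding target.
import textwrap
from PIL import Image, ImageDraw

def render_text_to_image(text, width=640, out_png="question.png"):
    lines = textwrap.wrap(text, width=70)
    img = Image.new("RGB", (width, 20 * len(lines) + 20), "white")
    draw = ImageDraw.Draw(img)
    for i, line in enumerate(lines):
        draw.text((10, 10 + 20 * i), line, fill="black")  # default bitmap font
    img.save(out_png)
    return out_png

question = ("The home team scored 21 points in the first half and 13 in the "
            "second. The visitors scored 17 in total. By how many points did "
            "the home team win?")
image = render_text_to_image(question)
# Training target for the decoder: "17"
```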

To improve the math reasoning skills of MatCha we incorporate examples from MATH and DROP into the pre-training objective, by rendering the input text as images.

End-to-end results

We use a Pix2Struct model backbone, which is an image-to-text transformer tailored for website understanding, and pre-train it with the two tasks described above. We demonstrate the strengths of MatCha by fine-tuning it on several visual language tasks — tasks involving charts and plots for question answering and summarization where no access to the underlying table is possible. MatCha surpasses previous models’ performance by a large margin and also outperforms the previous state of the art, which assumes access to underlying tables.
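
As a rough illustration of how such a pixels-to-text model is used at inference time, the sketch below loads a MatCha-style Pix2Struct checkpoint through the Hugging Face transformers library and asks a ChartQA-style question about a chart image. The checkpoint name and the chart path are assumptions for illustration, not taken from the paper.

```python
# A minimal sketch, assuming a fine-tuned ChartQA checkpoint is published on the
# Hugging Face Hub under a name like "google/matcha-chartqa".
from transformers import Pix2StructProcessor, Pix2StructForConditionalGeneration
from PIL import Image

processor = Pix2StructProcessor.from_pretrained("google/matcha-chartqa")
model = Pix2StructForConditionalGeneration.from_pretrained("google/matcha-chartqa")

image = Image.open("chart.png")  # hypothetical chart image
inputs = processor(images=image,
                   text="Which year had the most deliveries?",
                   return_tensors="pt")
predictions = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(predictions[0], skip_special_tokens=True))
```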

In the figure below, we first evaluate two baseline models that incorporate information from an OCR pipeline, which until recently was the standard approach for working with charts. The first is based on T5, the second on VisionTaPas. We also compare against PaLI-17B, which is a large (~1000 times larger than the other models) image plus text-to-text transformer trained on a diverse set of tasks but with limited capabilities for reading text and other forms of visual language. Finally, we report the Pix2Struct and MatCha model results.

Experimental results on two chart QA benchmarks, ChartQA and PlotQA (using relaxed accuracy), and a chart summarization benchmark, chart-to-text (using BLEU4). MatCha surpasses the state of the art on QA by a large margin, even compared to larger models, and matches these larger models on summarization.

For QA datasets, we use the official relaxed accuracy metric that allows for small relative errors in numerical outputs. For chart-to-text summarization, we report BLEU scores. MatCha achieves noticeably improved results compared to baselines for question answering, and comparable results to PaLI in summarization, where large size and extensive long text/captioning generation pre-training are advantageous for this kind of long-form text generation.
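
For reference, here is a minimal sketch of a relaxed-accuracy check; the 5% tolerance is the setting commonly used for these chart QA benchmarks, and the snippet is an illustration rather than the official evaluation code.

```python
# Relaxed accuracy: a numeric prediction counts as correct if it is within a
# small relative error of the target; other answers require an exact match.
def relaxed_match(prediction: str, target: str, tol: float = 0.05) -> bool:
    try:
        p, t = float(prediction), float(target)
        if t == 0:
            return p == 0
        return abs(p - t) / abs(t) <= tol
    except ValueError:
        return prediction.strip().lower() == target.strip().lower()

def relaxed_accuracy(predictions, targets):
    hits = sum(relaxed_match(p, t) for p, t in zip(predictions, targets))
    return hits / len(targets)

print(relaxed_match("102", "100"))  # True: 2% relative error
print(relaxed_match("110", "100"))  # False: 10% relative error
```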

Derendering plus large language model chains

While extremely performant for their number of parameters, particularly on extractive tasks, we observed that fine-tuned MatCha models could still struggle with end-to-end complex reasoning (e.g., mathematical operations involving large numbers or multiple steps). Thus, we also propose a two-step method to tackle this: 1) a model reads a chart, then outputs the underlying table, 2) a large language model (LLM) reads this output and then tries to answer the question solely based on the textual input.

For the first model, we fine-tuned MatCha solely on the chart-to-table task, increasing the output sequence length to guarantee it could recover all or most of the information in the chart. DePlot is the resulting model. In the second stage, any LLM (such as FlanPaLM or Codex) can be used for the task, and we can rely on the standard methods to increase performance on LLMs, for example chain-of-thought and self-consistency. We also experimented with program-of-thoughts where the model produces executable Python code to offload complex computations.
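
A minimal sketch of this two-stage chain, assuming a DePlot checkpoint is available through Hugging Face transformers and with `ask_llm` as a placeholder for whichever LLM you pair it with (the paper uses FlanPaLM and Codex):

```python
# Step 1: DePlot translates the chart into a linearized table.
# Step 2: a text-only LLM answers the question from that table.
from transformers import Pix2StructProcessor, Pix2StructForConditionalGeneration
from PIL import Image

processor = Pix2StructProcessor.from_pretrained("google/deplot")
deplot = Pix2StructForConditionalGeneration.from_pretrained("google/deplot")

def chart_to_table(image_path: str) -> str:
    image = Image.open(image_path)
    inputs = processor(images=image,
                       text="Generate underlying data table of the figure below:",
                       return_tensors="pt")
    out = deplot.generate(**inputs, max_new_tokens=512)
    return processor.decode(out[0], skip_special_tokens=True)

def answer_question(image_path: str, question: str, ask_llm) -> str:
    table = chart_to_table(image_path)
    prompt = ("Read the table and answer the question. Think step by step.\n\n"
              f"Table:\n{table}\n\nQuestion: {question}\nAnswer:")
    return ask_llm(prompt)  # ask_llm: any text-completion function you supply
```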

An illustration of the DePlot+LLM method. This is a real example using FlanPaLM and Codex. The blue boxes are input to the LLM and the red boxes contain the answer generated by the LLMs. We highlight some of the key reasoning steps in each answer.

As shown in the example above, the DePlot model in combination with LLMs outperforms fine-tuned models by a significant margin, especially so in the human-sourced portion of ChartQA, where the questions are more natural but demand more difficult reasoning. Furthermore, DePlot+LLM can do so without access to any training data.

We have released the new models and code at our GitHub repo, where you can try them out yourself in Colab. Check out the papers for MatCha and DePlot for more details on the experimental results. We hope that our results can benefit the research community and make the information in charts and plots more accessible to everyone.

Acknowledgements

This work was carried out by Fangyu Liu, Julian Martin Eisenschlos, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Wenhu Chen and Yasemin Altun from our Language Team as part of Fangyu’s internship project. Nigel Collier from Cambridge also was a collaborator. We would like to thank Joshua Howland, Alex Polozov, Shrestha Basu Mallick, Massimo Nicosia and William Cohen for their valuable comments and suggestions.

Find out how to save 87% on Adobe Creative Cloud All Apps

It can be tough to make it in the creative industry, so the last thing you need when you’re starting out is prohibitive subscription costs for the tools you need. This Adobe Creative Cloud All Apps deal can help as it drops the almost $250 cost of the service to just $29.99.

The plan is actually called Adobe Creative Cloud All Apps 100GB because that’s exactly what you get. It includes full use of over 20 Creative Cloud apps, such as Adobe Photoshop, Illustrator, Lightroom, and Premiere Pro, as well as 100GB of cloud storage.

Check out 19 of the best romantic movies on Netflix

Jessica Williams and Chris O’Dowd have a picnic in The Incredible Jessica James. Credit: Netflix

Despite the rise of many promising alternatives, streaming giant Netflix remains one of the top players in the game, with new shows and movies added weekly. Among them is a huge library of romantic movies, including plenty of Netflix originals. So, what are the best romance movies on Netflix?

Below, we offer our top picks: 19 romantic titles including comedies, dramas, feel-good gems, classics, new favorites, queer love stories, and plenty more. And if you’re not already a Netflix subscriber, you can sign up by hitting the link below.

Barkour: Benchmarking animal-level agility with quadruped robots

Creating robots that exhibit robust and dynamic locomotion capabilities, similar to animals or humans, has been a long-standing goal in the robotics community. In addition to completing tasks quickly and efficiently, agility allows legged robots to move through complex environments that are otherwise difficult to traverse. Researchers at Google have been pursuing agility for multiple years and across various form factors. Yet, while researchers have enabled robots to hike or jump over some obstacles, there is still no generally accepted benchmark that comprehensively measures robot agility or mobility. In contrast, benchmarks such as ImageNet for computer vision and OpenAI Gym for reinforcement learning (RL) have been driving forces behind the development of machine learning.

In “Barkour: Benchmarking Animal-level Agility with Quadruped Robots”, we introduce the Barkour agility benchmark for quadruped robots, along with a Transformer-based generalist locomotion policy. Inspired by dog agility competitions, a legged robot must sequentially display a variety of skills, including moving in different directions, traversing uneven terrains, and jumping over obstacles within a limited timeframe to successfully complete the benchmark. By providing a diverse and challenging obstacle course, the Barkour benchmark encourages researchers to develop locomotion controllers that move fast in a controllable and versatile way. Furthermore, by tying the performance metric to real dog performance, we provide an intuitive metric for understanding robot performance relative to its animal counterparts.


We invited a handful of dooglers to try the obstacle course to ensure that our agility objectives were realistic and challenging. Small dogs complete the obstacle course in approximately 10s, whereas our robot’s typical performance hovers around 20s.

Barkour benchmark

The Barkour scoring system uses a per obstacle and an overall course target time based on the target speed of small dogs in the novice agility competitions (about 1.7m/s). Barkour scores range from 0 to 1, with 1 corresponding to the robot successfully traversing all the obstacles along the course within the allotted time of approximately 10 seconds, the average time needed for a similar-sized dog to traverse the course. The robot receives penalties for skipping, failing obstacles, or moving too slowly.
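
To make the scoring concrete, here is a minimal sketch of a Barkour-style score; the penalty weights are illustrative placeholders, not the official benchmark constants.

```python
# Start from 1.0, subtract a penalty per skipped/failed obstacle, and subtract
# a penalty proportional to time beyond the ~10 s allotted course time.
def barkour_score(obstacle_results, course_time_s,
                  allotted_time_s=10.0, fail_penalty=0.1,
                  overtime_penalty_per_s=0.01):
    score = 1.0
    for completed in obstacle_results:  # e.g. [True, True, False, True]
        if not completed:
            score -= fail_penalty
    if course_time_s > allotted_time_s:
        score -= overtime_penalty_per_s * (course_time_s - allotted_time_s)
    return max(0.0, score)

# A robot that clears all four obstacles in 20 s, under these assumptions:
print(barkour_score([True, True, True, True], course_time_s=20.0))  # 0.9
```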

Our standard course consists of four unique obstacles in a 5m x 5m area. This is a denser and smaller setup than a typical dog competition to allow for easy deployment in a robotics lab. Beginning at the start table, the robot needs to weave through a set of poles, climb an A-frame, clear a 0.5m broad jump and then step onto the end table. We chose this subset of obstacles because they test a diverse set of skills while keeping the setup within a small footprint. As is the case for real dog agility competitions, the Barkour benchmark can be easily adapted to a larger course area and may incorporate a variable number of obstacles and course configurations.

Overview of the Barkour benchmark’s obstacle course setup, which consists of weave poles, an A-frame, a broad jump, and pause tables. The intuitive scoring mechanism, inspired by dog agility competitions, balances speed, agility and performance and can be easily modified to incorporate other types of obstacles or course configurations.

Learning agile locomotion skills

The Barkour benchmark features a diverse set of obstacles and a delayed reward system, which pose a significant challenge when training a single policy that can complete the entire obstacle course. So in order to set a strong performance baseline and demonstrate the effectiveness of the benchmark for robotic agility research, we adopt a student-teacher framework combined with a zero-shot sim-to-real approach. First, we train individual specialist locomotion skills (teacher) for different obstacles using on-policy RL methods. In particular, we leverage recent advances in large-scale parallel simulation to equip the robot with individual skills, including walking, slope climbing, and jumping policies.

Next, we train a single policy (student) that performs all the skills and transitions in between by using a student-teacher framework, based on the specialist skills we previously trained. We use simulation rollouts to create datasets of state-action pairs for each one of the specialist skills. This dataset is then distilled into a single Transformer-based generalist locomotion policy, which can handle various terrains and adjust the robot’s gait based on the perceived environment and the robot’s state.
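
A minimal sketch of the distillation step under simplifying assumptions: the logged rollouts are random placeholders and a small MLP stands in for the locomotion Transformer, but the behavior-cloning loop is the same idea.

```python
# Behavior-clone specialist (teacher) rollouts into one generalist (student) policy.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

OBS_DIM, ACT_DIM = 320, 12  # hypothetical: elevation map + proprioception -> joint targets

# Placeholders for rollouts collected in simulation from the specialist policies.
states = torch.randn(10_000, OBS_DIM)
actions = torch.randn(10_000, ACT_DIM)
loader = DataLoader(TensorDataset(states, actions), batch_size=256, shuffle=True)

student = nn.Sequential(
    nn.Linear(OBS_DIM, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, ACT_DIM),
)
opt = torch.optim.Adam(student.parameters(), lr=3e-4)

for epoch in range(5):
    for obs, teacher_act in loader:
        loss = nn.functional.mse_loss(student(obs), teacher_act)
        opt.zero_grad()
        loss.backward()
        opt.step()
```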

During deployment, we pair the locomotion transformer policy that is capable of performing multiple skills with a navigation controller that provides velocity commands based on the robot’s position. Our trained policy controls the robot based on the robot’s surroundings represented as an elevation map, velocity commands, and on-board sensory information provided by the robot.


Deployment pipeline for the locomotion transformer architecture. At deployment time, a high-level navigation controller guides the real robot through the obstacle course by sending commands to the locomotion transformer policy.
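
A minimal sketch of what such a deployment loop could look like; `robot`, `navigator`, and `policy` are hypothetical interfaces standing in for the real hardware and controller stack.

```python
# High-level navigation turns waypoints into velocity commands; the learned
# policy maps the elevation map, velocity command, and proprioception to
# joint targets at a fixed control rate.
import time

CONTROL_DT = 0.02  # 50 Hz control loop (illustrative)

def run_course(robot, navigator, policy):
    while not navigator.course_finished(robot.position()):
        vel_cmd = navigator.velocity_command(robot.position())  # (vx, vy, yaw_rate)
        obs = {
            "elevation_map": robot.elevation_map(),
            "velocity_command": vel_cmd,
            "proprioception": robot.sensor_readings(),
        }
        joint_targets = policy(obs)  # learned locomotion policy
        robot.apply_joint_targets(joint_targets)
        time.sleep(CONTROL_DT)
```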

Robustness and repeatability are difficult to achieve when we aim for peak performance and maximum speed. Sometimes, the robot might fail when overcoming an obstacle in an agile way. To handle failures we train a recovery policy that quickly gets the robot back on its feet, allowing it to continue the episode.

Evaluation

We evaluate the Transformer-based generalist locomotion policy using custom-built quadruped robots and show that by optimizing for the proposed benchmark, we obtain agile, robust, and versatile skills for our robot in the real world. We further provide analysis for various design choices in our system and their impact on the system performance.

Model of the custom-built robots used for evaluation.

We deploy both the specialist and generalist policies to hardware (zero-shot sim-to-real). The robot’s target trajectory is provided by a set of waypoints along the various obstacles. In the case of the specialist policies, we switch between specialist policies by using a hand-tuned policy switching mechanism that selects the most suitable policy given the robot’s position.
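
For illustration, here is a minimal sketch of a hand-tuned switching rule; the segment boundaries and policy names are made up for this example.

```python
# Pick whichever specialist policy owns the course segment the robot is in,
# based on its progress along the course (in meters).
SEGMENTS = [
    ((0.0, 1.5), "walk"),         # start table -> weave poles
    ((1.5, 3.0), "slope_climb"),  # A-frame
    ((3.0, 4.0), "jump"),         # broad jump
    ((4.0, 5.0), "walk"),         # step onto end table
]

def select_policy(progress_along_course_m, policies):
    for (lo, hi), name in SEGMENTS:
        if lo <= progress_along_course_m < hi:
            return policies[name]
    return policies["walk"]
```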


Typical performance of our agile locomotion policies on the Barkour benchmark. Our custom-built quadruped robot robustly navigates the terrain’s obstacles by leveraging various skills learned using RL in simulation.

We find that very often our policies can handle unexpected events or even hardware degradation resulting in good average performance, but failures are still possible. As illustrated in the image below, in case of failures, our recovery policy quickly gets the robot back on its feet, allowing it to continue the episode. By combining the recovery policy with a simple walk-back-to-start policy, we are able to run repeated experiments with minimal human intervention to measure the robustness.


Qualitative example of robustness and recovery behaviors. The robot trips and rolls over after heading down the A-frame. This triggers the recovery policy, which enables the robot to get back up and continue the course.

We find that across a large number of evaluations, the single generalist locomotion transformer policy and the specialist policies with the policy switching mechanism achieve similar performance. The locomotion transformer policy has a slightly lower average Barkour score, but exhibits smoother transitions between behaviors and gaits.


Measuring robustness of the different policies across a large number of runs on the Barkour benchmark.

Histogram of the agility scores for the locomotion transformer policy. The highest scores shown in blue (0.75 – 0.9) represent the runs where the robot successfully completes all obstacles.

Conclusion

We believe that developing a benchmark for legged robotics is an important first step in quantifying progress toward animal-level agility. To establish a strong baseline, we investigated a zero-shot sim-to-real approach, taking advantage of large-scale parallel simulation and recent advancements in training Transformer-based architectures. Our findings demonstrate that Barkour is a challenging benchmark that can be easily customized, and that our learning-based method for solving the benchmark provides a quadruped robot with a single low-level policy that can perform a variety of agile low-level skills.

Acknowledgments

The authors of this post are now part of Google DeepMind. We would like to thank our co-authors at Google DeepMind and our collaborators at Google Research: Wenhao Yu, J. Chase Kew, Tingnan Zhang, Daniel Freeman, Kuang-Hei Lee, Lisa Lee, Stefano Saliceti, Vincent Zhuang, Nathan Batchelor, Steven Bohez, Federico Casarini, Jose Enrique Chen, Omar Cortes, Erwin Coumans, Adil Dostmohamed, Gabriel Dulac-Arnold, Alejandro Escontrela, Erik Frey, Roland Hafner, Deepali Jain, Yuheng Kuang, Edward Lee, Linda Luu, Ofir Nachum, Ken Oslund, Jason Powell, Diego Reyes, Francesco Romano, Feresteh Sadeghi, Ron Sloat, Baruch Tabanpour, Daniel Zheng, Michael Neunert, Raia Hadsell, Nicolas Heess, Francesco Nori, Jeff Seto, Carolina Parada, Vikas Sindhwani, Vincent Vanhoucke, and Jie Tan. We would also like to thank Marissa Giustina, Ben Jyenis, Gus Kouretas, Nubby Lee, James Lubin, Sherry Moore, Thinh Nguyen, Krista Reymann, Satoshi Kataoka, Trish Blazina, and the members of the robotics team at Google DeepMind for their contributions to the project. Thanks to John Guilyard for creating the animations in this post.

Google is killing a stolen YouTube feature

  • YouTube is killing Stories on June 26.
  • Stories that are already live on that date will expire seven days after they were originally shared.
  • YouTube now wants creators to focus on Community posts and Shorts.

Google is putting an end to YouTube Stories, a feature it borrowed from platforms like Snapchat in 2017. According to a support post made by the company, the feature will shut down on June 26, 2023. YouTube is now encouraging creators to use Community posts and Shorts instead of YouTube Stories.
