Astronomy Generates Mountains of Data. That’s Perfect for AI

Consumer-grade AI is finding its way into people’s daily lives with its ability to generate text and images and automate tasks. But astronomers need much more powerful, specialized AI. The vast amounts of observational data generated by modern telescopes and observatories defy astronomers’ efforts to extract all of their meaning.

A team of scientists is developing a new AI for astronomical data called AstroPT. They’ve presented it in a new paper titled “AstroPT: Scaling Large Observation Models for Astronomy.” The paper is available at arxiv.org, and the lead author is Michael J. Smith, a data scientist and astronomer from Aspia Space.

Astronomers are facing a growing deluge of data, which will expand enormously when the Vera Rubin Observatory (VRO) comes online in 2025. The VRO has the world’s largest camera, and each of its images could fill 1500 large-screen TVs. During its ten-year mission, the VRO will generate about 0.5 exabytes of data, which is about 50,000 times more data than is contained in the USA’s Library of Congress.

The VRO’s need for multiple sites to handle all of its data is a testament to the enormous volume of data it will generate. Without effective AI, that data will be stuck in a bottleneck. Image Credit: NOIRLab.

Other telescopes with enormous mirrors are also approaching first light. The Giant Magellan Telescope, the Thirty Meter Telescope, and the European Extremely Large Telescope combined will generate an overwhelming amount of data.

Having data that can’t be processed is the same as not having the data at all. It’s basically inert and has no meaning until it’s processed somehow. “When you have too much data, and you don’t have the technology to process it, it’s like having no data,” said Cecilia Garraffo, a computational astrophysicist at the Harvard-Smithsonian Center for Astrophysics.

This is where AstroPT comes in.

AstroPT stands for Astro Pretrained Transformer, where a transformer is a particular type of AI. Transformers can change or transform an input sequence into an output sequence. AI needs to be trained, and AstroPT has been trained on 8.6 million 512 x 512-pixel images from the DESI Legacy Survey Data Release 8. DESI is the Dark Energy Spectroscopic Instrument. DESI studies the effect of Dark Energy by capturing the optical spectra from tens of millions of galaxies and quasars.

AstroPT and similar AI deal with ‘tokens.’ Tokens are visual elements in a larger image that contain meaning. By breaking images down into tokens, an AI can understand the larger meaning of an image. AstroPT can transform individual tokens into coherent output.

AstroPT has been trained on visual tokens. The idea is to teach the AI to predict the next token. The more thoroughly it’s been trained to do that, the better it will perform.
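As a rough illustration of what "predicting the next token" means during training, the sketch below turns a sequence into (context, target) pairs: everything seen so far is the input, and the token that follows is the answer the model is asked to guess. The function name `next_token_pairs` and the placeholder patch labels are hypothetical and are not taken from the AstroPT paper or its code.

```python
def next_token_pairs(tokens):
    """Return (context, target) pairs for next-token prediction.

    For each position i >= 1, the context is every token before i
    and the target is the token at i.
    """
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


# Placeholder labels standing in for image patches (an assumption for illustration).
sequence = ["patch_0", "patch_1", "patch_2", "patch_3"]
for context, target in next_token_pairs(sequence):
    print(context, "->", target)
```

A real model would be scored on how close its guess is to each target, and training would adjust it to shrink that error.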

“We demonstrated that simple generative autoregressive models can learn scientifically useful information when pre-trained on the surrogate task of predicting the next 16 × 16 pixel patch in a sequence of galaxy image patches,” the authors write. In this scheme, each image patch is a token.
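To make the patch idea concrete, here is a minimal sketch of cutting a square image into 16 × 16 pixel patches, assuming the image is a plain list of pixel rows. The helper `split_into_patches` is hypothetical and is not the authors' implementation; it only shows how a 512 × 512 image yields a fixed grid of patch tokens.

```python
def split_into_patches(image, patch_size=16):
    """Cut a 2D image (list of rows) into non-overlapping square patches.

    Returns a dict mapping (patch_row, patch_col) grid positions to patches,
    where each patch is a list of patch_size rows of patch_size pixels.
    """
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be multiples of patch_size")

    patches = {}
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches[(top // patch_size, left // patch_size)] = patch
    return patches


# A blank 512 x 512 single-channel image used purely as a stand-in.
image = [[0] * 512 for _ in range(512)]
patches = split_into_patches(image)
print(len(patches))  # 1024, i.e. a 32 x 32 grid of patches
```

Under these assumptions, one 512 × 512 image becomes 1,024 tokens per channel.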

This image illustrates how the authors trained AstroPT to predict the next token in a ‘spiralised’ sequence of galaxy image patches. It shows the token feed order. “As the galaxies are in the centre of each postage stamp, this set up allows us to seamlessly pretrain and run inference on differently sized galaxy postage stamps,” the authors explain. Image Credit: Smith et al. 2024.
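One way to picture a "spiralised" feed order is to walk the grid of patches outward from the centre in a square spiral. The sketch below does that for an n × n grid. The starting cell, the turning direction, and the name `spiral_order` are all assumptions made for illustration; the exact path used in the paper may differ.

```python
def spiral_order(n):
    """Return (row, col) positions of an n x n grid, spiralling out from the centre.

    The walk moves right, down, left, up with run lengths 1, 1, 2, 2, 3, 3, ...
    and keeps only the positions that fall inside the grid.
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    row = col = (n - 1) // 2  # central cell (upper-left of centre when n is even)
    order = [(row, col)]
    directions = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up
    run_length = 1
    direction_index = 0

    while len(order) < n * n:
        for _ in range(2):  # each run length is used for two consecutive directions
            d_row, d_col = directions[direction_index % 4]
            for _ in range(run_length):
                row += d_row
                col += d_col
                if 0 <= row < n and 0 <= col < n:
                    order.append((row, col))
            direction_index += 1
        run_length += 1
    return order


print(spiral_order(3))
```

Because the sequence always begins at the centre, a smaller postage stamp is simply a shorter prefix of the same spiral, which fits the authors' remark about handling differently sized stamps.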

One of the obstacles to training AI like AstroPT concerns what AI scientists call the ‘token crisis.’ To be effective, AI needs to be trained on a large number of quality tokens. In a 2023 paper, a separate team of researchers explained that a lack of tokens can limit the effectiveness of some AI, such as Large Language Models (LLMs). “State-of-the-art LLMs require vast amounts of internet-scale text data for pre-training,” they wrote. “Unfortunately, … the growth rate of high-quality text data on the internet is much slower than the growth rate of data required by LLMs.”

AstroPT faces the same problem: a dearth of quality tokens to train on. It is what the team calls a Large Observation Model (LOM), a model trained on observational data rather than text. The team says their results so far suggest that observational data can help solve the token crisis. “This is a promising result that suggests that data taken from the observational sciences would complement data from other domains when used to pre-train a single multimodal LOM, and so points towards the use of observational data as one solution to the ‘token crisis’,” the authors write.

AI developers are eager to find solutions to the token crisis and other AI challenges.

Without better AI, a data processing bottleneck will prevent astronomers and astrophysicists from making discoveries from the vast quantities of data that will soon arrive. Can AstroPT help?

The authors are hoping that it can, but it needs much more development. They say they’re open to collaborating with others to strengthen AstroPT. To aid that, they followed “current leading community models” as closely as possible. They call it an “open to all project.”

“We took these decisions in the belief that collaborative community development paves the fastest route towards realising an open source web-scale large observation model,” they write.

“We warmly invite potential collaborators to join us,” they conclude.

It’ll be interesting to see how AI developers will keep up with the vast amount of astronomical data coming our way.

The post Astronomy Generates Mountains of Data. That’s Perfect for AI appeared first on Universe Today.
