LSTMs: Smarter than they appear:
…Turns out you don’t need to use a Transformer to develop rich, combinatorial representations…
Long Short-Term Memory (LSTM) networks are among the most widely used deep learning architectures. Until recently, if you wanted to develop sophisticated natural language understanding AI systems, you’d use an LSTM. Then in the past couple of years, people started switching over to the ‘Transformer’ architecture because it comes with inbuilt attention, which lets it smartly analyze long-range dependencies in data.
Now, researchers from the University of Edinburgh have studied how LSTMs learn long-range dependencies: LSTMs figure out how to make predictions about long sequences by learning patterns in short sequences, then using these patterns as ‘scaffolds’ for learning longer, more complex ones. “Acquisition is biased towards bottom-up learning, using the constituent as a scaffold to support the long-distance rule,” they write. “These results indicate that predictable patterns play a vital role in shaping the representations of symbols around them by composing in a way that cannot be easily linearized as a sum of the component parts”.
The goldilocks problem: However, this form of learning has some drawbacks – if you get your data mix wrong, the LSTM might quickly learn how to solve shorter sequences but fail to generalize to longer ones. And if you make its training distribution too hard, it might struggle to learn at all.
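For readers who want to poke at this themselves, here’s a minimal, hypothetical sketch (not the paper’s code) of probing length generalization in PyTorch: train an LSTM on short ‘parity’ sequences, then measure how accuracy degrades as test sequences grow longer.

```python
# A toy probe of length generalization: train an LSTM on short parity
# sequences, test on longer ones. A sketch, not the Edinburgh paper's setup.
import torch
import torch.nn as nn

def make_batch(batch_size, seq_len):
    # Random bit strings; label = parity (XOR) of all bits -- a simple
    # long-range dependency that gets harder as sequences grow.
    x = torch.randint(0, 2, (batch_size, seq_len, 1)).float()
    y = (x.sum(dim=1) % 2).squeeze(1).long()
    return x, y

class ParityLSTM(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(1, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # classify from the final hidden state

model = ParityLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Train only on short sequences (the 'scaffold').
for step in range(2000):
    x, y = make_batch(64, seq_len=8)
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

# Evaluate on progressively longer sequences to see where generalization breaks.
with torch.no_grad():
    for length in (8, 16, 32, 64):
        x, y = make_batch(1024, seq_len=length)
        acc = (model(x).argmax(1) == y).float().mean().item()
        print(f"len={length}: accuracy={acc:.2f}")
```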
Why this matters – more human than you think: In recent years, one of the more surprising trends in AI has been the identification of surface-level commonalities between how our AI systems learn and how people learn. This study of the LSTM provides some slight evidence that these networks, though basic, learn via some of the same rich, compositional procedures that people use. “The LSTM’s demonstrated inductive bias towards hierarchical structures is implicitly aligned with our understanding of language and emerges from its natural learning process,” they write.
Read more: LSTMs Compose (and Learn) Bottom-Up (arXiv).
###################################################
Google speeds up AI chip design by 8.6X with new RL training system:
…Menger: The machine that learns the machines…
Google has developed Menger, software that lets the company train reinforcement learning systems at a large scale. This is one of those superficially dull announcements that is surprisingly significant. That’s because RL, while useful, is currently quite computationally expensive; getting good results means throwing lots of compute at the problem, which requires being able to run a sophisticated learning system at a large scale. That’s what Menger is designed to do – in tests, Google says it has used Menger to speed up training of an RL system for a chip placement task by 8.6x, cutting the training time for the task from 8.6 hours to one hour (when using 512 CPU cores).
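Menger itself isn’t public, but the basic pattern it scales up – many actors generating experience in parallel for a central learner – can be sketched in a few lines. Here’s a toy, hypothetical Python illustration (not Google’s code, and vastly simpler than a real distributed RL system):

```python
# A toy actor-learner loop: actors collect experience in parallel and push
# it to a central learner via a queue. Hypothetical sketch, not Menger.
import random
import multiprocessing as mp

def actor(actor_id, experience_queue, steps=100):
    # Each actor runs its own environment and ships transitions to the learner.
    for _ in range(steps):
        state = random.random()            # stand-in for an env observation
        action = random.choice([0, 1])     # stand-in for a policy decision
        reward = state if action == 1 else -state
        experience_queue.put((actor_id, state, action, reward))
    experience_queue.put(None)             # sentinel: this actor is done

def learner(experience_queue, num_actors):
    done, batch, updates = 0, [], 0
    while done < num_actors:
        item = experience_queue.get()
        if item is None:
            done += 1
            continue
        batch.append(item)
        if len(batch) >= 64:               # 'train' on a batch of experience
            updates += 1
            batch.clear()
    print(f"learner performed {updates} updates")

if __name__ == "__main__":
    queue = mp.Queue()
    actors = [mp.Process(target=actor, args=(i, queue)) for i in range(4)]
    for p in actors:
        p.start()
    learner(queue, num_actors=4)
    for p in actors:
        p.join()
```

The hard part Menger tackles is doing this across thousands of actors with fast model serving and efficient data pipelines; the toy version above just shows the shape of the problem.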
Why this matters: Google is at the beginning of building an AI flywheel – that is, a suite of complementary bits of AI software which can accelerate Google’s broader software development practices. Menger will help Google more efficiently train AI systems, and Google will use that to do things like develop more efficient computers (and sets of computers) via chip placement, and these new computers will then be used to train and develop successive systems. Things are going to accelerate rapidly from here.
Read more: Massively Large-Scale Distributed Reinforcement Learning with Menger (Google AI Blog).
###################################################
Access Now leaves PAI:
…Human rights vs. corporate rights…
Civil society organization Access Now is leaving the Partnership on AI, a multi-stakeholder group (with heavy corporate participation) that tries to bring people together to talk about AI and its impact on society.
Talking is easy, change is hard: Over its lifetime, PAI has accomplished a few things, but one of the inherent issues with the org is that ‘it is what people make of it’ – which means many of the corporations treat it like an extension of their broader public relations and government affairs initiatives. “While we support dialogue between stakeholders, we did not find that PAI influenced or changed the attitude of member companies or encouraged them to respond to or consult with civil society on a systematic basis,” Access Now said in a statement.
Why this matters: In the absence of informed and effective regulators, society needs to figure out the rules of the road for AI development. PAI is an organization that’s meant to play that role, but Access Now’s experience illustrates how hard it is for a single org to deal with structural inequities that make some of its members very powerful (e.g., the tech companies) and others comparatively weaker.
Read more: Access Now resigns from the Partnership on AI (Access Now official website).
###################################################
Can AI tackle climate change? Facebook and CMU think so:
…Science, meet function approximation…
Researchers with Facebook and Carnegie Mellon University have built a massive dataset to help people develop ML systems that can discover good electrocatalysts for use in renewable energy storage technologies. The Open Catalyst Dataset contains 1.2 million molecular relaxations (stable low-energy states) with results from over 250 million DFT (density functional theory) calculations.
DFT, for those not familiar with the finer aspects of modeling the essence of the universe, is a punishingly expensive way to model fine-grained interactions (e.g., molecular reactions). DFT simulations can take “hours–weeks per simulation using O(10–100) core CPUs on structures containing O(10–100) atoms,” Facebook writes. “As a result, complete exploration of catalysts using DFT is infeasible. DFT relaxations also fail more often when the structures become large (number of atoms) and complex”.
Therefore, the value of what Facebook and CMU have done here is that they’ve eaten the cost of a bunch of DFT simulations and released the results as a rich dataset, which ML researchers can use to train systems that approximate the underlying calculations. Maybe that sounds dull to you, but this is literally a way to drastically reduce the cost of a branch of science that is existentially important to the future of the earth, so I think it’s pretty interesting!
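The core move here – fit a cheap learned surrogate to the outputs of an expensive simulator – looks roughly like the following minimal, hypothetical sketch, where synthetic data stands in for real DFT results (real models for this task typically use graph neural networks over atomic structures, not hand-made feature vectors):

```python
# Toy surrogate model: regress simulation outputs from structure features.
# Synthetic data stands in for DFT results -- a sketch, not the OC20 pipeline.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Pretend each catalyst structure is summarized by 16 numeric features
# (composition fractions, geometry descriptors, etc.).
X = rng.normal(size=(5000, 16))
# Pretend 'relaxed energy' is some nonlinear function of those features.
y = np.sin(X[:, 0]) + X[:, 1] * X[:, 2] + 0.1 * rng.normal(size=5000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each DFT calculation takes hours-to-weeks; once trained, the surrogate
# answers in microseconds, letting you screen vastly more candidates.
model = GradientBoostingRegressor().fit(X_train, y_train)
print("R^2 on held-out structures:", model.score(X_test, y_test))
```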
Why this matters: Because deep learning systems can learn complex functions then approximate them, we’re going to see people use them more and more to carry out unfeasibly expensive scientific exercises – like attempting to approximate highly complex chemical interactions. In this respect, Open Catalyst sits alongside projects like DeepMind’s ‘AlphaFold’ (Import AI 209, 189), or earlier work like ‘ChemNet’ which tries to pre-train systems on large chemistry datasets then apply them to smaller ones (Import AI 72).
Read more:Open Catalyst Project (official website).
Get the data for Open Catalyst here (GitHub).
Read the paper: An Introduction to Electrocatalyst Design using Machine Learning for Renewable Energy Storage (PDF).
Read more: The Open Catalyst 2020 (OC20) Dataset and Community Challenges (PDF).
###################################################
Google X builds field-grokking robot to analyze plant life:
…Sometimes, surveillance is great!…
Google has revealed Project Mineral, a Google X initiative that uses machine learning to analyze how plants grow in fields and to make farmers more efficient. As part of this, Google has built a small robot buggy that patrols these fields, using cameras paired with onboard AI systems to do on-the-fly analysis of the plants beneath it.
“What if we could measure the subtle ways a plant responds to its environment? What if we could match a crop variety to a parcel of land for optimum sustainability? We knew we couldn’t ask and answer every question — and thanks to our partners, we haven’t needed to. Breeders and growers around the world have worked with us to run experiments to find new ways to understand the plant world,” writes Elliott Grant, who works at Google X.
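As a concrete (and entirely hypothetical) illustration of the kind of per-frame inference loop a buggy like this might run, here’s a sketch using an off-the-shelf pretrained classifier – Mineral’s actual models and camera stack aren’t public:

```python
# A hypothetical sketch of a per-frame inference loop for a field robot:
# grab camera frames and classify each one. Not Mineral's actual stack --
# the model here is a generic ImageNet classifier; a real system would use
# crop-specific models and log per-plant traits instead of printing labels.
import cv2
import torch
from torchvision import models

weights = models.ResNet18_Weights.DEFAULT
model = models.resnet18(weights=weights).eval()
preprocess = weights.transforms()        # resize/normalize to match training
labels = weights.meta["categories"]

cap = cv2.VideoCapture(0)                # stand-in for the buggy's camera feed
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV frames arrive as BGR
    batch = preprocess(torch.from_numpy(rgb).permute(2, 0, 1)).unsqueeze(0)
    with torch.no_grad():
        pred = model(batch).softmax(dim=1).argmax(dim=1).item()
    print("frame label:", labels[pred])
cap.release()
```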
Why this matters: AI gives us new tools to digitize the world. Digitization is useful because it lets us point computers at various problems and get them to operate over larger amounts of data than a single human can comprehend. Project Mineral is a nice example of applied machine learning ‘in the field’ – haha!
Read more: Project Mineral (Google X website).
Read more: Mineral: Bringing the era of computational agriculture to life (Google X blog, 2020).
Read more: Entering the era of computational agriculture (Google X blog, 2019).
###################################################
Tech Tales:
[2040]
Ghosts
When the robots die, we turn them into ghosts. It started out as good scientific practice – if you’re retiring a complex AI system, train a model to emulate it, then keep that model on a hard drive somewhere. Now you’ve got a version of your robot that’s like an insect preserved in amber – it can’t move, update itself in the world, or carry out independent actions. But it can still speak to you, if you access its location and ask it a question.
There’ve been a lot of nicknames for the computers where we keep the ghosts. The boneyard. Heaven. Hell. Yggdrasil. The Morgue. Cold Storage. Babel. But these days we call it ghostworld.
Not everyone can access a ghost, of course. That’d be risky – some of them know dangerous things, or can produce things that can be used to accomplish mischief. But we try to keep it as accessible as possible.
Recently, we’ve started to let our robots speak to their ghosts. Not all of them, of course. In fact, we only let the robots access a subset of the ghosts that we let most people access. This started out as another scientific experiment – what if we gave our living robots the ability to go and speak to some of their predecessors? Could they learn things faster? Figure stuff out?
Yes and no. Some robots drive themselves mad when they talk to the dead. Others grow more capable. We’re still not sure about which direction a given robot will take, when we let it talk to the ghosts. But when they get more capable after their conversations with their forebears, they do so in ways that we humans can’t quite figure out. The robots are learning from their dead. Why would we expect to be able to understand this?
There’s been talk, recently, of combining ghosts. What if we took a load of these old systems and re-animated them in a modern body – better software, more access to appendages like drones and robot arms, internet links, and so on. Might this ghost-of-ghosts start taking actions in the world quite different to those of its forebears, or those of the currently living systems? And how would the robots react if we let their dead walk among them again?
We’ll do it, of course. We’re years away from figuring out human immortality – how to turn ourselves into our own ghosts. So perhaps we can be necromancers with our robots and they will teach us something about ourselves? Perhaps death and the desire to learn from it and speak with it can become something our two species have in common.
Things that inspired this story: Imagining neural network archives; morgues; the difference between ‘alive’ agents that are continuously learning and ones that are static or can be made static.