Deep Visual-Semantic Alignments for Generating Image Descriptions

Because of the Nov. 14th submission deadline for this year's IEEE Conference on Computer Vision and Pattern Recognition (CVPR), several big image-recognition papers are coming out this week:

From Andrej Karpathy and Li Fei-Fei of Stanford: We present a model that generates free-form natural language descriptions of image regions. Our model leverages datasets of images and their sentence descriptions to learn about the inter-modal correspondences between text and visual data. Our approach is based on a novel combination of Convolutional Neural Networks over image regions, bidirectional Recurrent Neural Networks over sentences, and a structured objective that aligns the two modalities through a multimodal embedding. We then describe a Recurrent Neural Network architecture that uses the inferred alignments to learn to generate novel descriptions of image regions. We demonstrate the effectiveness of our alignment model with ranking experiments on Flickr8K, Flickr30K and COCO datasets, where we substantially improve on the state of the art. We then show that the sentences created by our generative model outperform retrieval baselines on the three aforementioned datasets and a new dataset of region-level annotations... (website with examples) (full paper)

From Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan at Google: Show and Tell: A Neural Image Caption Generator (announcement post) (full paper)

From Ryan Kiros, Ruslan Salakhutdinov, and Richard S. Zemel at University of Toronto: Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models (full paper)

From Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, and Alan L. Yuille at Baidu Research/UCLA: Explain Images with Multimodal Recurrent Neural Networks (full paper)

From Jeff Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, and Trevor Darrell at UT Austin, UMass Lowell and UC Berkeley: Long-term Recurrent Convolutional Networks for Visual Recognition and Description (full paper)

All these came from this Hacker News discussion.

Teledyne DALSA Displays Industrial Vision Solutions at Rockwell's Automation Fair 2014

Automation Fair will take place November 19-20, 2014, at the Anaheim Convention Center, Anaheim, CA.

Researchers Make Self-Learning Robots using 3D Printers

On the third floor of the Department of Informatics there is a robotics laboratory that looks like a playroom. This is where researchers are testing how their robots can figure out how to move past barriers and other obstacles.

Opto Diode's New Quadrant Photodiode 5 mm² - SXUVPS4

The SXUVPS4 photodiodes are ideal for laser alignment applications.

Trust Automation Receives Fourth Contract to Provide Motion Control System for Counterfire Radar Production

In today's rapidly evolving security landscape of unconventional battlefields and irregular warfare, our soldiers need to quickly locate and neutralize mortar and rocket threats.

Schunk - Cleanroom Gripper for Small to Medium Part Handling

The EGS from SCHUNK is perfect for gripping small to medium-sized workpieces, with flexible force and high speed in clean environments, such as assembly, testing, laboratory, and the pharmaceutical industry.

New Dual Wideband Transceiver from VadaTech in FMC Format

VadaTech, a manufacturer of embedded boards and complete application-ready platforms, has announced a dual wideband transceiver in the FPGA Mezzanine Card (FMC) format. The board has a frequency range of 70 MHz to 6 GHz and is ideal for LTE, SDR, and other advanced RF applications.

Microscan Offers Advanced Optical Character Recognition for Machine Vision with New IntelliText OCR Tool

Microscan, a global technology leader in barcode, machine vision, and lighting solutions, announces the latest innovation from its award-winning AutoVISION® machine vision suite: IntelliText OCR. Teach your machines to read rotated, damaged, or low-contrast text. Get more intelligent text recognition on parts and packages. Adapt to any print, lighting, or surface condition with IntelliText OCR.

IAR Systems Enables Development for Low-Power Sensor Processing Based on New NXP Dual-Core Series

Together with the LPC54100 series, IAR Embedded Workbench is the ultimate choice for always-on sensor applications for the Internet of Things.

Fabtech - Metcam Sends 100 Employees to FABTECH This Week and Presents Educational Session on Reducing Environmental Footprint

Leading Fabricator Boasts the Largest Corporate Delegation at Principal Metal Industry Event in North America

Innodisk Expands Embedded Peripheral Series with Communication Modules for IoT and Embedded Applications

Expansion Modules add Gigabit-Ethernet or Multiport Serial Connectivity

New Kinetix 5500 Servo Drive With Integrated Safety Enhances Machine Performance and Flexibility

Complete integration in Logix Designer for safety, motion and control

Patented DPFlex II Sensorless Brushless Drive Exceeds Performance of Hall-Based Drives

Allied Motion's innovative new DPFlex II sensorless brushless drive hits the market.

Microscan Introduces EtherNet/IP™ for the World's Smallest Machine Vision Smart Camera

Microscan, a global technology leader in barcode, machine vision, and lighting solutions, announces the availability of EtherNet/IP™ communication on its ultra-compact machine vision smart camera, Vision MINI Xi.

New 6-Slot MicroTCA.4 Chassis Features High Performance Density in 2U Height

VadaTech, a manufacturer of embedded boards and complete application-ready platforms, now offers a 2U chassis platform that complies with the MicroTCA.4 specification for High Energy Physics and other applications requiring rear IO. The rear IO capability is an attractive option in many Mil/Aero, Physics, Broadcast, Energy, and Network Security designs.

Factory Automation - Featured Product

Discover how human-robot collaboration can take flexibility to new heights!

Humans and robots can now share tasks, and this new partnership is on the verge of revolutionizing the production line. Today's drivers, such as data-driven services, decreasing product lifetimes and the need for product differentiation, are making flexibility paramount, and no technology is better suited to meet these needs than the Omron TM Series Collaborative Robot. With force feedback, collision detection technology and an intuitive, hand-guided teaching mechanism, the TM Series cobot is designed to work in immediate proximity to a human worker and is easier than ever to train on new tasks.