HP Aims To Boot ‘Useless’ Data
Hewlett-Packard wants to help organizations rid themselves of useless data, all the information that is no longer necessary, yet still occupies expensive space on storage servers.
The company’s Autonomy unit has released a new module, called Autonomy Legacy Data Cleanup, that can delete data automatically based on the material’s age and other factors, according to Joe Garber, Autonomy’s vice president of information governance.
Hewlett-Packard announced the new software, along with a number of other updates and new services, at its HP Discover conference, being held this week in Las Vegas.
For this year’s conference, HP will focus on “products, strategies and solutions that allow our customers to take command of their data that has value, and monetize that information,” said Saar Gillai, HP’s senior vice president and general manager for the converged cloud.
The company is pitching Autonomy Legacy Data Cleanup for eliminating no-longer-relevant data in old SharePoint sites and in e-mail repositories. The software requires the new version of Autonomy’s policy engine, ControlPoint 4.0.
HP Autonomy Legacy Data Cleanup evaluates whether to delete a file based on several factors, Garber said. One is the age of the material: if an organization’s information governance policy is to keep data for only seven years, for example, the software will delete anything older than that. It also roots out and deletes duplicate data, along with material that is not worth saving in the first place, such as system files. Finally, it can weigh how often employees actually access the data: the less a file is consulted, the better a candidate it is for deletion.
Administrators can set other controls as well. If used in conjunction with the indexing and categorization capabilities in Autonomy’s IDOL data analysis platform, the new software can eliminate clusters of data on a specific topic. “You apply policies to broad swaths of data based on some conceptual analysis you are able to do on the back end,” Garber said.
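The factors Garber lists amount to a straightforward policy evaluation. As a rough illustration only, assuming a simple filesystem walk rather than Autonomy’s actual interface (the function name, thresholds and file-type list below are hypothetical), a cleanup pass might apply the rules like this:

    # Illustrative sketch only: names and thresholds are hypothetical and do not
    # reflect Autonomy Legacy Data Cleanup's real API; they simply mirror the
    # age, duplicate, system-file and access-frequency rules described above.
    import hashlib
    import os
    import time

    RETENTION_YEARS = 7                            # e.g. a seven-year governance policy
    ACCESS_CUTOFF_DAYS = 365                       # untouched for a year = candidate
    SYSTEM_EXTENSIONS = {".tmp", ".dll", ".sys"}   # "not worth saving" file types

    def is_deletion_candidate(path, seen_hashes):
        """Return True if a file matches any of the cleanup rules."""
        stat = os.stat(path)
        now = time.time()

        # Rule 1: older than the retention window.
        if now - stat.st_mtime > RETENTION_YEARS * 365 * 24 * 3600:
            return True

        # Rule 2: duplicate of content already seen elsewhere.
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        if digest in seen_hashes:
            return True
        seen_hashes.add(digest)

        # Rule 3: system files are not worth keeping.
        if os.path.splitext(path)[1].lower() in SYSTEM_EXTENSIONS:
            return True

        # Rule 4: rarely accessed material is a better candidate for deletion.
        return now - stat.st_atime > ACCESS_CUTOFF_DAYS * 24 * 3600

In the shipping product these decisions are driven by ControlPoint policies rather than hard-coded thresholds, and IDOL’s conceptual analysis layers topic-level rules on top of per-file checks like these.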
The First PC Had a Birthday
IBM introduced its IBM PC, model 5150, on August 12th 1981, 30 years ago today.
The first IBM PC wasn’t much by today’s standards. It had an Intel 8088 processor running at the blazing speed of 4.77MHz. The base memory configuration was all of 16kB, expandable to 256kB, and it had two 5.25in floppy disk drives with a capacity of 160kB each, but no hard drive.
A keyboard and 12in monochrome monitor were included, with a colour monitor optional. The 5150 ran IBM BASIC in ROM and came with a PC-DOS boot diskette supplied by a previously unknown Seattle software startup named Microsoft.
IBM priced its initial IBM PC at a whopping $1,565, a steep price in those days, worth about $5,000 today, give or take a few hundred dollars. In the US in 1981, that was about the cost of a decent used car.
Because the IBM PC was aimed at the general public but IBM had no retail stores of its own, the company sold it through the stores of US catalogue retailer Sears, Roebuck and Co.
Subsequently, IBM released follow-on models through 1986, including the PC/XT, the first with an internal hard drive; the PC/AT, with an 80286 chip running at 6MHz and later 8MHz; the 6MHz XT/286, whose zero-wait-state memory made it actually faster than the 8MHz PC/AT; the (not very) Portable and Convertible models; the ill-fated XT/370, AT/370, 3270 PC and 3270/AT mainframe terminal emulators; and the unsuccessful PCjr.
IBM Debuts Fast Storage System
With an eye toward helping tomorrow’s data-intensive organizations, IBM researchers have developed a super-fast storage system capable of scanning 10 billion files in 43 minutes.
It easily bested the researchers’ previous system, demonstrated at Supercomputing 2007, which scanned 1 billion files in three hours.
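Put on the same footing, the two figures work out to a roughly 40-fold improvement in scan rate. A quick back-of-the-envelope check, using only the numbers quoted above:

    # Files scanned per second in each demonstration, from the figures above.
    new_rate = 10_000_000_000 / (43 * 60)    # ~3.9 million files per second
    old_rate = 1_000_000_000 / (3 * 3600)    # ~93,000 files per second (2007 demo)
    print(round(new_rate / old_rate))        # prints 42: roughly a 42x speedup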
Key to the increased performance was the use of speedy flash memory to store the metadata the storage system uses to locate requested information. Traditionally, metadata repositories reside on disk, and accessing them slows operations.
“If we have that data on very fast storage, then we can do those operations much more quickly,” said Bruce Hillsberg, director of storage systems at IBM Research Almaden, where the cluster was built. “Being able to use solid-state storage for metadata operations really allows us to do some of these management tasks more quickly than we could ever do if it was all on disk.”
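In outline, the idea is simply to keep the metadata index on the fast tier and the file contents on the slower one, so that management scans never touch disk. The sketch below only illustrates that principle and is not IBM’s actual implementation; the class and field names are invented:

    # Hypothetical sketch of the metadata-on-flash principle, not IBM's design.
    class FastTier:
        """Stands in for solid-state storage holding only metadata."""
        def __init__(self):
            self.index = {}      # file name -> (size_bytes, mtime, disk_location)

    class SlowTier:
        """Stands in for spinning disk holding the file contents themselves."""
        def __init__(self):
            self.blocks = {}     # disk_location -> bytes

    def scan_older_than(fast_tier, cutoff_mtime):
        # A management scan (e.g. "find every file older than X") touches only
        # the flash-resident index; the disk tier is read only when a file's
        # contents are actually needed.
        return [name
                for name, (_, mtime, _) in fast_tier.index.items()
                if mtime < cutoff_mtime]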
IBM foresees that its customers will be grappling with a lot more information in the years to come.
“As customers have to store and process large amounts of data for large periods of time, they will need efficient ways of managing that data,” Hillsberg said.
For the new demonstration, IBM built a cluster of 10 eight-core servers equipped with a total of 6.8 terabytes of solid-state memory, using four 3205 solid-state storage systems from Violin Memory. The resulting system was able to read files at a rate of almost 5 GB/s (gigabytes per second).