| US 7,475,085 B2 | ||
| Method and apparatus for privacy preserving data mining by restricting attribute choice | ||
| Charu C. Aggarwal, Mohegan Lake, N.Y. (US); and Nagui Halim, Yorktown Heights, N.Y. (US) | ||
| Assigned to International Business Machines Corporation, Armonk, N.Y. (US) | ||
| Filed on Apr. 04, 2006, as Appl. No. 11/397,297. | ||
| Prior Publication US 2007/0233711 A1, Oct. 04, 2007 | ||
| Int. Cl. G06F 17/30 (2006.01) | ||
| U.S. Cl. 707—101 [707/1; 707/2; 707/9; 707/100; 707/103 R] | 21 Claims |

| 1. A method of generating at least one output data set from at least one input data set for use in association with a data mining process, the input data set comprising at least one entry including each of a plurality of attributes, comprising the steps of:
determining at least one relevance coefficient for at least a subset of the plurality of attributes;
selecting at least one relevant attribute of the at least one input data set based at least in part on the at least one relevance coefficient; and
generating the at least one output data set from the at least one input data set;
wherein the at least one output data set comprises at least one entry not including at least one of the plurality of attributes; and
wherein the at least one entry of the output data set has the at least one relevant attribute of the at least one input data set;
wherein the at least one relevance coefficient is computed using a quantitative measure of an effect on the data mining process of a deletion of at least the given attribute from each entry of the input data set.
|
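
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation; it only shows one way the three steps could fit together. Everything named here is an assumption made for illustration: the "data mining process" is stood in for by leave-one-out 1-nearest-neighbour classification, the "quantitative measure" is the drop in that accuracy when an attribute is deleted from every entry, and the function names (`loo_1nn_accuracy`, `relevance_coefficients`, `select_relevant`, `restrict`), the top-`k` selection rule, and the toy data are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Entry = Dict[str, float]
Score = Callable[[Sequence[Entry], Sequence[int], Sequence[str]], float]


def loo_1nn_accuracy(entries: Sequence[Entry], labels: Sequence[int],
                     attributes: Sequence[str]) -> float:
    """Assumed stand-in for the data mining process:
    leave-one-out 1-nearest-neighbour accuracy using only `attributes`."""
    if not attributes:
        return 0.0
    correct = 0
    for i, entry in enumerate(entries):
        best_j, best_d = -1, float("inf")
        for j, other in enumerate(entries):
            if i == j:
                continue
            d = sum((entry[a] - other[a]) ** 2 for a in attributes)
            if d < best_d:  # strict '<' keeps the first neighbour on ties
                best_j, best_d = j, d
        if labels[best_j] == labels[i]:
            correct += 1
    return correct / len(entries)


def relevance_coefficients(entries: Sequence[Entry], labels: Sequence[int],
                           attributes: Sequence[str],
                           score: Score) -> Dict[str, float]:
    """Step 1: relevance of an attribute = loss in score when that
    attribute is deleted from every entry of the input data set."""
    baseline = score(entries, labels, attributes)
    return {
        a: baseline - score(entries, labels, [b for b in attributes if b != a])
        for a in attributes
    }


def select_relevant(coeffs: Dict[str, float], k: int) -> List[str]:
    """Step 2 (assumed rule): keep the k attributes with the largest coefficients."""
    return sorted(coeffs, key=lambda a: coeffs[a], reverse=True)[:k]


def restrict(entries: Sequence[Entry], keep: Sequence[str]) -> List[Entry]:
    """Step 3: output entries carry only the selected attributes."""
    return [{a: e[a] for a in keep} for e in entries]


# Toy input data set: "x" separates the two classes, "noise" does not.
entries = [
    {"x": 0.0, "noise": 0.5}, {"x": 0.1, "noise": 0.4}, {"x": 0.2, "noise": 0.6},
    {"x": 1.0, "noise": 0.5}, {"x": 1.1, "noise": 0.4}, {"x": 1.2, "noise": 0.6},
]
labels = [0, 0, 0, 1, 1, 1]

coeffs = relevance_coefficients(entries, labels, ["x", "noise"], loo_1nn_accuracy)
selected = select_relevant(coeffs, k=1)
output = restrict(entries, selected)
```

On this toy input, deleting `x` removes all the class information while deleting `noise` changes nothing, so `x` receives the larger coefficient and is the only attribute retained in `output`. Any other scoring function with the same signature could be passed as `score`; the claim itself does not prescribe a particular measure.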