Patent No. US 11227063
Issue Date Jan 18, 2022

Patent 11227063 - User experience using privatized crowdsourced data > Claims

  • 1. A machine-readable medium storing instructions which, when executed by one or more processors of a computing device, cause the computing device to perform operations comprising: selecting a value of user data to transmit to a server from a set of possible user data values collected on a client device; creating a hash of a term for the value of user data using a random hash function, wherein the random hash function is generated from a randomly selected variant of the term, and wherein a set of possible variants of the term is indexed; encoding at least a portion of the hash of the term for the value as a vector, wherein the encoding includes updating a value of the vector at a position corresponding to the hash; privatizing the vector by changing at least some of the values of the vector with a predefined probability; and transmitting, to the server, the privatized vector and an index value of the randomly selected variant to enable the server to estimate of a frequency of the value of user data on a set of client devices.
    • 2. The machine-readable medium of claim 1, wherein: the term is a string of characters and the randomly selected variant of the term includes one or more characters representing the index value of the variant appended to the string; the server estimates the frequency of the value of user data by updating a frequency table indexed by the set of possible variants and a row or column of the frequency table corresponding to the index value of the random hash function is updated with the privatized vector; and the server estimates a frequency of each of the set of possible user data values amongst from data accumulated from the set of client devices.
    • 3. The machine-readable medium of claim 1, wherein the encoding includes initializing the vector with a uniform value and sign and updating the value of the vector includes flipping the sign of the value at the position corresponding to a created hash value.
      • 4. The machine-readable medium of claim 3, wherein initializing the vector includes setting each value of the vector to a value representing a constant and updating the value of the vector includes setting the value at the position corresponding to a created hash value to a value representing a sign flip of the constant.
    • 5. The machine-readable medium of claim 1, wherein the random hash function is to address hash collisions when using only the portion of a created hash value and to reduce a number of computations required to create a frequency table while maintaining privacy of the user data values.
    • 6. The machine-readable medium of claim 1, wherein the encoded vector is a Hadamard matrix and the value of user data represents a website visited by a user of the client device.
    • 7. The machine-readable medium of claim 1, wherein only the privatized vector and the index value of the randomly selected variant is transmitted to the server as information representing the value of user data.
    • 8. The machine-readable medium of claim 1, wherein privatizing the vector includes changing at least some of the values of the vector with a predefined probability, the predefined probability based on a privacy parameter.
      • 9. The machine-readable medium of claim 8, wherein the privacy parameter represents a configured tradeoff between privacy and accuracy.
      • 10. The machine-readable medium of claim 8, wherein the predefined probability is defined as 1/(1+e^ε), and ε is the privacy parameter.
  • 11. An electronic device comprising: one or more processors; and a memory coupled to the one or more processors, the memory storing instructions, which when executed by the one or more processor, cause the one or more processors to perform operations comprising: selecting a value of user data to transmit to a server from a set of possible user data values collected on a client device; creating a hash of a term for the value of user data using a random hash function, wherein the random hash function is generated from a randomly selected variant of the term, and wherein a set of possible variants of the term is indexed; encoding at least a portion of the hash of the term for the value as a vector, wherein the encoding includes updating a value of the vector at a position corresponding to the hash; privatizing the vector by changing at least some of the values of the vector with a predefined probability; and transmitting, to the server, the privatized vector and an index value of the randomly selected variant to enable the server to estimate of a frequency of the value of user data on a set of client devices.
    • 12. The electronic device of claim 11, wherein: the term is a string of characters and the randomly selected variant of the term includes one or more characters representing the index value of the variant appended to the string; the server estimates the frequency of the value of user data by updating a frequency table indexed by the set of possible variants and a row or column of the frequency table corresponding to the index value of the random hash function is updated with the privatized vector; and the server estimates a frequency of each of the set of possible user data values amongst from data accumulated from the set of client devices.
    • 13. The electronic device of claim 11, wherein the encoding includes initializing the vector with a uniform value and sign and updating the value of the vector includes flipping the sign of the value at the position corresponding to a created hash value.
      • 14. The electronic device of claim 13, wherein initializing the vector includes setting each value of the vector to a value representing a constant and updating the value of the vector includes setting the value at the position corresponding to a created hash value to a value representing a sign flip of the constant.
    • 15. The electronic device of claim 11, wherein the random hash function is to address hash collisions when using only the portion of a created hash value and to reduce a number of computations required to create a frequency table while maintaining privacy of the user data values.
    • 16. The electronic device of claim 11, wherein the encoded vector is a Hadamard matrix and the value of user data represents a website visited by a user of the client device.
    • 17. The electronic device of claim 11, wherein only the privatized vector and the index value of the randomly selected variant is transmitted to the server as information representing the value of user data.
    • 18. The electronic device of claim 11, wherein privatizing the vector includes changing at least some of the values of the vector with a predefined probability, the predefined probability based on a privacy parameter.
      • 19. The electronic device of claim 18, wherein the privacy parameter represents a configured tradeoff between privacy and accuracy.
      • 20. The electronic device of claim 18, wherein the predefined probability is defined as 1/(1+e^ε), and ε is the privacy parameter.
  • 21. A method comprising: selecting a value of user data to transmit to a server from a set of possible user data values collected on a client device; creating a hash of a term for the value of user data using a random hash function, wherein the random hash function is generated from a randomly selected variant of the term, and wherein a set of possible variants of the term is indexed; encoding at least a portion of the hash of the term for the value as a vector, wherein the encoding includes updating a value of the vector at a position corresponding to the hash; privatizing the vector by changing at least some of the values of the vector with a predefined probability; and transmitting, to the server, the privatized vector and an index value of the randomly selected variant to enable the server to estimate of a frequency of the value of user data on a set of client devices.
    • 22. The method of claim 21, wherein: the term is a string of characters and the randomly selected variant of the term includes one or more characters representing the index value of the variant appended to the string; the server estimates the frequency of the value of user data by updating a frequency table indexed by the set of possible variants and a row or column of the frequency table corresponding to the index value of the random hash function is updated with the privatized vector; and the server estimates a frequency of each of the set of possible user data values amongst from data accumulated from the set of client devices.
    • 23. The method of claim 21, wherein the encoding includes initializing the vector with a uniform value and sign and updating the value of the vector includes flipping the sign of the value at the position corresponding to a created hash value.
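
Illustrative sketch (not part of the claims): the steps recited above can be approximated in a few lines of Python. Everything named here is an assumption made for illustration: the vector length M, the variant count K, the use of SHA-256 as the underlying hash, the choice of -1 as the initial constant, and all function names. The server-side debiasing and the count-mean-sketch style estimator are also assumptions; the claims only state that the table row for the transmitted index is updated with the privatized vector. The Hadamard encoding mentioned in claims 6 and 16 is not shown.

```python
import hashlib
import math
import random

M = 64  # vector length (hypothetical value)
K = 16  # number of indexed variants, i.e. hash functions (hypothetical value)


def hash_position(term, variant_index, m=M):
    """Hash the variant (term with its index appended) and keep a portion of the digest."""
    variant = f"{term}{variant_index}"
    digest = hashlib.sha256(variant.encode("utf-8")).digest()  # SHA-256 is an assumption
    return int.from_bytes(digest[:8], "big") % m


def encode(position, m=M, constant=-1):
    """Initialize every entry to a constant, then sign-flip the hashed position."""
    vector = [constant] * m
    vector[position] = -constant
    return vector


def flip_probability(epsilon):
    """Predefined probability 1/(1+e^epsilon) from claims 10 and 20."""
    return 1.0 / (1.0 + math.exp(epsilon))


def privatize(vector, epsilon, rng):
    """Flip the sign of each entry independently with the predefined probability."""
    p = flip_probability(epsilon)
    return [-v if rng.random() < p else v for v in vector]


def client_report(term, epsilon, rng, k=K, m=M):
    """Return the only data sent to the server: (privatized vector, variant index)."""
    j = rng.randrange(k)  # randomly selected variant
    vector = encode(hash_position(term, j, m), m)
    return privatize(vector, epsilon, rng), j


def new_table(k=K, m=M):
    """Frequency table with one row per variant index."""
    return [[0.0] * m for _ in range(k)]


def server_update(table, report, epsilon):
    """Add a debiased copy of the report to row j (debiasing is an assumption; needs epsilon > 0)."""
    vector, j = report
    k = len(table)
    c = (math.exp(epsilon) + 1.0) / (math.exp(epsilon) - 1.0)
    for i, v in enumerate(vector):
        # (c*v + 1)/2 is an unbiased estimate of the one-hot entry; the factor k
        # compensates for each client sampling only one of the k variants.
        table[j][i] += k * (c * v + 1.0) / 2.0


def estimate(table, term, n):
    """Estimate how many of the n reports encoded `term` (assumed estimator)."""
    k, m = len(table), len(table[0])
    mean = sum(table[j][hash_position(term, j, m)] for j in range(k)) / k
    return (m / (m - 1)) * (mean - n / m)  # remove the expected collision mass
```

A caller would create one table with new_table(), feed every received report through server_update(), and then call estimate() for each candidate value. A smaller ε makes flips more frequent, which strengthens privacy and increases the variance of the estimate, matching the tradeoff described in claims 9 and 19.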