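Anyone who has worked on financial or government projects has almost certainly had to implement fuzzy search over encrypted data. How does everyone go about it? Today, let's walk through a few approaches to fuzzy search on encrypted data, with simple examples.
Approach 1: In-memory filtering at the application layer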
How it works
Query the whole table of encrypted data
Decrypt in memory, then filter
public List<User> unsafeSearch(String keyword) {
    // Loads every row, then decrypts and filters in application memory
    List<User> allUsers = userRepository.findAll();
    return allUsers.stream()
            .filter(user -> decrypt(user.getName()).contains(keyword))
            .collect(Collectors.toList());
}
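Fatal flaw: every search loads and decrypts the entire table, so memory use and response time grow linearly with the number of rows.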
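Approach 2: Decrypting with database functions
MySQL example
SELECT * FROM users WHERE AES_DECRYPT(name_encrypted, 'secret_key') LIKE '%张%';
Pros and cons:
✅ No changes to business code
❌ The key is exposed to the database (it travels in the SQL text and may end up in query logs)
❌ Indexes cannot be used (full table scan)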
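A safer way to write it in Java (the key and keyword are bound as parameters instead of being concatenated into the SQL string):
// keyword must already contain the LIKE wildcards, e.g. "%张%"
@Query(nativeQuery = true, value =
        "SELECT * FROM users WHERE "
        + "CONVERT(AES_DECRYPT(name_encrypted, ?1) USING utf8) LIKE ?2")
List<User> searchByName(String key, String keyword);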
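Because the keyword is bound directly to LIKE, the caller has to add the `%` wildcards, and any `%` or `_` typed by the user would otherwise act as a wildcard. A minimal sketch of a helper for that, assuming MySQL's default backslash escape character (the class name `LikePatterns` is made up for illustration and is not part of the original code):

```java
final class LikePatterns {
    private LikePatterns() {}

    /** Escapes LIKE wildcards in the keyword and wraps it for a "contains" match. */
    static String contains(String keyword) {
        String escaped = keyword
                .replace("\\", "\\\\") // escape the escape character itself first
                .replace("%", "\\%")   // literal percent sign
                .replace("_", "\\_");  // literal underscore
        return "%" + escaped + "%";
    }
}
```

It would be used as `searchByName(key, LikePatterns.contains("张"))`. If the server runs with `NO_BACKSLASH_ESCAPES`, an explicit `ESCAPE` clause would be needed instead.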
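Approach 3: Tokenization + ciphertext index (recommended)
1. Core idea
At write time, split the plaintext into tokens and store a deterministic keyed digest of each token in a separate index table. At query time, tokenize the keyword the same way, look up candidate rows by digest, then decrypt only those candidates and verify the match.
2. Code implementation
(1) Enhanced tokenization strategy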
<dependency>
    <groupId>com.janeluo</groupId>
    <artifactId>ikanalyzer</artifactId>
    <version>2012_u6</version>
</dependency>
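import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import org.wltea.analyzer.core.IKSegmenter;
import org.wltea.analyzer.core.Lexeme;

public class EnhancedTokenizer {
    public static List<String> smartTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        try (StringReader reader = new StringReader(text)) {
            IKSegmenter seg = new IKSegmenter(reader, true); // true = smart mode
            Lexeme lex;
            while ((lex = seg.next()) != null) {
                tokens.add(lex.getLexemeText());
                // Also add every prefix of the lexeme that is at least 2 characters long
                int begin = lex.getBeginPosition();
                for (int i = 2; i <= lex.getLength(); i++) {
                    tokens.add(text.substring(begin, begin + i));
                }
            }
        } catch (IOException e) {
            // seg.next() declares IOException, so it has to be handled here
            throw new UncheckedIOException(e);
        }
        return tokens.stream().distinct().collect(Collectors.toList());
    }
}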
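To see what kind of token list ends up in the index without pulling in IK Analyzer, here is a dependency-free sketch. It is not the IK-based tokenizer above; it swaps in plain sliding-window n-grams, and the class name `NGramTokenizer` and its parameters are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class NGramTokenizer {
    private NGramTokenizer() {}

    /** Returns all distinct substrings of length minLen..maxLen, in order of first appearance. */
    static List<String> ngrams(String text, int minLen, int maxLen) {
        // Work on code points so characters outside the BMP are never split in half.
        int[] cps = text.codePoints().toArray();
        Set<String> tokens = new LinkedHashSet<>();
        for (int start = 0; start < cps.length; start++) {
            for (int len = minLen; len <= maxLen && start + len <= cps.length; len++) {
                tokens.add(new String(cps, start, len));
            }
        }
        return new ArrayList<>(tokens);
    }
}
```

For example, `NGramTokenizer.ngrams("张三丰", 1, 2)` yields `[张, 张三, 三, 三丰, 丰]`. N-grams produce more index rows than dictionary-based segmentation, but they can match any substring up to `maxLen`.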
(2) Double-encrypted index table
@Entity
@Table(name = "user_search_index")
public class SearchIndex {
    @Id
    @GeneratedValue
    private Long id;

    @Column(name = "user_id")
    private Long userId;

    // Lookup column: must be deterministic (same token, same value) so equality matching works
    @Column(name = "token_hash")
    private String tokenHash;

    // The token encrypted under a second, separate key
    @Column(name = "token_cipher")
    private String tokenCipher;

    // Getters and setters omitted

    public static SearchIndex create(Long userId, String token) throws Exception {
        SearchIndex index = new SearchIndex();
        index.setUserId(userId);
        index.setTokenHash(AESUtil.encryptWithKey(token, "HASH_KEY"));
        index.setTokenCipher(AESUtil.encryptWithKey(token, "CIPHER_KEY"));
        return index;
    }
}
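The `token_hash` column only works for lookups if the same token always maps to the same value. The article's `AESUtil.encryptWithKey` is not shown, so as an alternative, here is a minimal sketch of a deterministic keyed digest using HMAC-SHA256 from the JDK. This is a swapped-in technique rather than the author's method, and the class name `TokenHasher` is hypothetical.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;

final class TokenHasher {
    private TokenHasher() {}

    /** Returns the HMAC-SHA256 of the token as 64 lowercase hex characters. */
    static String hmacSha256Hex(byte[] key, String token) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            byte[] digest = mac.doFinal(token.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            // HmacSHA256 is a standard JDK algorithm, so this is not expected in practice
            throw new IllegalStateException(e);
        }
    }
}
```

The same key and token always give the same digest, which is what lets the index be queried with `=` or `IN`. The trade-off is that equal tokens are visibly equal in the index, so token frequencies leak to anyone who can read the table.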
(3) Smart query flow
public List<User> secureFuzzySearch(String keyword) throws Exception {
    // 1. Tokenize the keyword the same way the indexed values were tokenized
    List<String> tokens = EnhancedTokenizer.smartTokenize(keyword);

    // 2. Compute the lookup value for each token
    List<String> hashList = tokens.parallelStream()
            .map(t -> {
                try {
                    return AESUtil.encryptWithKey(t, "HASH_KEY");
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            })
            .collect(Collectors.toList());

    // 3. Fetch at most 100 candidate user IDs from the index table
    Pageable page = PageRequest.of(0, 100);
    List<Long> userIds = indexRepository
            .findUserIdsByTokenHashes(hashList, page);

    // 4. Decrypt the candidates and keep only the real matches
    return userRepository.findAllById(userIds).stream()
            .filter(user -> {
                try {
                    String decrypted = AESUtil.decrypt(user.getNameCipher());
                    return decrypted.contains(keyword);
                } catch (Exception e) {
                    return false; // rows that fail to decrypt are skipped
                }
            })
            .collect(Collectors.toList());
}
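The whole flow (index by token digest, look up candidates, verify after decryption) can be tried end to end without a database. The sketch below makes several simplifying assumptions that are not in the original: fixed bigrams instead of IK tokens, intersection of the bigram hits, HMAC-SHA256 as the digest, and a plain map standing in for "decrypt the stored ciphertext". The class name `BlindIndexDemo` is hypothetical.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class BlindIndexDemo {
    private final byte[] hashKey;
    // token digest -> ids of rows containing that token (stands in for user_search_index)
    private final Map<String, Set<Long>> index = new HashMap<>();
    // id -> name (stands in for decrypting the ciphertext column of each row)
    private final Map<Long, String> names = new HashMap<>();

    BlindIndexDemo(byte[] hashKey) {
        this.hashKey = hashKey.clone();
    }

    void add(long id, String name) {
        names.put(id, name);
        for (String token : bigrams(name)) {
            index.computeIfAbsent(hash(token), h -> new HashSet<>()).add(id);
        }
    }

    /** Returns matching ids in ascending order; keywords shorter than 2 chars return nothing. */
    List<Long> search(String keyword) {
        Set<Long> candidates = null;
        for (String token : bigrams(keyword)) {
            Set<Long> hits = index.getOrDefault(hash(token), Set.of());
            if (candidates == null) {
                candidates = new TreeSet<>(hits);
            } else {
                candidates.retainAll(hits); // a row must contain every bigram of the keyword
            }
        }
        List<Long> result = new ArrayList<>();
        if (candidates == null) {
            return result;
        }
        for (Long id : candidates) {
            // Final check on the plaintext removes false positives from the index
            if (names.get(id).contains(keyword)) {
                result.add(id);
            }
        }
        return result;
    }

    // Char-based bigrams: fine for this demo, but would split surrogate pairs
    private static List<String> bigrams(String text) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 2 <= text.length(); i++) {
            out.add(text.substring(i, i + 2));
        }
        return out;
    }

    private String hash(String token) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(hashKey, "HmacSHA256"));
            return Base64.getEncoder()
                    .encodeToString(mac.doFinal(token.getBytes(StandardCharsets.UTF_8)));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The final `contains` check matters: a name such as "三丰张三" contains both bigrams of "张三丰" and is therefore returned by the index, yet it does not contain the keyword itself.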
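Approach 4: Homomorphic encryption (academic-grade approach)
Implementation based on the SEAL library
Note: Microsoft SEAL itself is a C++ library with .NET wrappers. The Maven coordinates and Java classes below assume a JNI wrapper around it and should be verified before use.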
<dependency>
    <groupId>com.microsoft.seal</groupId>
    <artifactId>seal-jni</artifactId>
    <version>3.7.2</version>
</dependency>
public class HomomorphicSearch {
    public static void demo() throws Exception {
        // Scheme parameters (BFV)
        EncryptionParameters params = new EncryptionParameters(SchemeType.bfv);
        params.setPolyModulusDegree(4096);
        params.setCoeffModulus(CoeffModulus.BFVDefault(4096));
        params.setPlainModulus(PlainModulus.Batching(4096, 20));

        SEALContext context = new SEALContext(params);

        // Key generation
        KeyGenerator keyGen = new KeyGenerator(context);
        PublicKey publicKey = keyGen.createPublicKey();
        SecretKey secretKey = keyGen.secretKey();

        // Encrypt
        Encryptor encryptor = new Encryptor(context, publicKey);
        Plaintext plain = new Plaintext("123");
        Ciphertext encrypted = new Ciphertext();
        encryptor.encrypt(plain, encrypted);

        // Compute on ciphertext: encrypted + encrypted
        Evaluator evaluator = new Evaluator(context);
        Ciphertext result = new Ciphertext();
        evaluator.add(encrypted, encrypted, result);

        // Decrypt the result
        Decryptor decryptor = new Decryptor(context, secretKey);
        Plaintext decrypted = new Plaintext();
        decryptor.decrypt(result, decrypted);
        System.out.println(decrypted);
    }
}
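Practical limitations:
Only a limited set of operations is supported (addition, subtraction, multiplication)
Poor performance (a single operation can take 100 ms or more)
No support for string operations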
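To make "computing on ciphertext" and the "limited operations" point concrete with nothing but the JDK, here is a toy sketch. It is not SEAL and not BFV; it swaps in the textbook Paillier cryptosystem, which supports only addition of encrypted integers. The class name `PaillierDemo` and its methods are made up for illustration, and the code is for demonstration only, not production use.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

final class PaillierDemo {
    private final SecureRandom random = new SecureRandom();
    private final BigInteger n;        // public modulus n = p * q
    private final BigInteger nSquared; // ciphertexts live modulo n^2
    private final BigInteger lambda;   // private: lcm(p - 1, q - 1)
    private final BigInteger mu;       // private: lambda^-1 mod n (valid for g = n + 1)

    PaillierDemo(int primeBits) {
        BigInteger p = BigInteger.probablePrime(primeBits, random);
        BigInteger q;
        do {
            q = BigInteger.probablePrime(primeBits, random);
        } while (q.equals(p));
        n = p.multiply(q);
        nSquared = n.multiply(n);
        BigInteger p1 = p.subtract(BigInteger.ONE);
        BigInteger q1 = q.subtract(BigInteger.ONE);
        lambda = p1.multiply(q1).divide(p1.gcd(q1));
        mu = lambda.modInverse(n);
    }

    /** Encrypts m, where 0 <= m < n. Randomized: the same m gives different ciphertexts. */
    BigInteger encrypt(BigInteger m) {
        BigInteger r;
        do {
            r = new BigInteger(n.bitLength(), random);
        } while (r.signum() == 0 || r.compareTo(n) >= 0 || !r.gcd(n).equals(BigInteger.ONE));
        // With generator g = n + 1, g^m mod n^2 simplifies to 1 + m*n
        BigInteger gm = BigInteger.ONE.add(m.multiply(n)).mod(nSquared);
        return gm.multiply(r.modPow(n, nSquared)).mod(nSquared);
    }

    BigInteger decrypt(BigInteger c) {
        BigInteger u = c.modPow(lambda, nSquared);
        BigInteger l = u.subtract(BigInteger.ONE).divide(n); // L(u) = (u - 1) / n
        return l.multiply(mu).mod(n);
    }

    /** Homomorphic addition: multiplying ciphertexts adds the underlying plaintexts. */
    BigInteger add(BigInteger c1, BigInteger c2) {
        return c1.multiply(c2).mod(nSquared);
    }
}
```

Adding the encryptions of 20 and 22 and decrypting the result gives 42, without the party doing the addition ever seeing either number. There is nothing comparable here for substring matching, which is why this family of schemes does not translate directly into a fuzzy search.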
Approach 5: Trusted Execution Environment (TEE)
A solution based on Intel SGX
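// ECALL: this function runs inside the enclave
void ecall_fuzzy_search(const char* encrypted_query, size_t len) {
    // Decrypt the query; the plaintext exists only inside the enclave
    std::string query = decrypt_in_enclave(encrypted_query, len);

    // In-memory database held in enclave memory
    sqlite3* db;
    sqlite3_open(":memory:", &db);
    // Note: concatenating the keyword is open to SQL injection;
    // real code should bind it as a statement parameter
    std::string sql = "SELECT * FROM users WHERE name LIKE '%" + query + "%'";

    // Encrypt the results before they leave the enclave
    encrypt_results(db_exec(sql));
    sqlite3_close(db);
}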
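Advantages:
Hardware-level security
Performance close to plaintext queries
Challenges:
Requires specific hardware support
High development complexity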
This article was last edited on 2025/6/14 16:36:35