好的！为了便于理解，我将为您提供一篇关于封装哈希表实现 unordered_map 和 unordered_set 的详细文章。虽然 5000 字的内容比较多，下面将分步骤进行展示，包含代码示例、使用场景以及详细的说明。

封装哈希表实现 `unordered_map` 和 `unordered_set`

哈希表（Hash Table）是一种常用的数据结构，它通过哈希函数将数据映射到一个固定大小的数组中，从而使得数据的插入、删除和查找操作平均时间复杂度为 O(1)。哈希表的变种 unordered_map 和 unordered_set 在 C++ 标准库中非常常见，分别用于存储键值对和仅存储唯一元素的集合。

本文将介绍如何封装实现一个基本的哈希表，并构建出类似 unordered_map 和 unordered_set 的数据结构。我们将深入探讨其内部实现、使用场景以及性能分析。

1. 哈希表基础

1.1 哈希表的基本概念

哈希表是一种基于数组实现的映射数据结构，其核心思想是通过哈希函数将数据映射到数组的某个位置。哈希表的优势在于查找、插入、删除操作的时间复杂度通常为 O(1)，这是因为哈希函数能够将数据均匀地分配到数组中。

哈希表的操作有以下几种：

插入（Insert）：将元素插入哈希表。
查找（Search）：根据键值查找对应的元素。
删除（Delete）：根据键值删除元素。

当多个元素的哈希值相同（即发生哈希冲突）时，哈希表通常采用链地址法（Chaining）或开放地址法（Open Addressing）来解决冲突问题。

1.2 哈希函数

哈希函数是哈希表的核心，它决定了如何将键映射到哈希表的槽（Slot）。哈希函数的设计需要保证冲突的概率尽可能低，并且能够均匀分布数据。一个好的哈希函数应该满足以下要求：

均匀性：哈希函数应该尽量使所有的键值均匀分布到哈希表中。
快速性：哈希函数的计算应该尽量快速，以避免成为性能瓶颈。

常见的哈希函数有：std::hash，MD5，SHA1 等。

2. 封装哈希表实现 `unordered_map`

在 C++ 中，unordered_map 是一个典型的基于哈希表的键值对存储容器。它提供了 O(1) 的查找、插入和删除性能。

2.1 基本数据结构设计

我们将从头开始设计一个简单的哈希表类，并封装其基础操作。

cppCopy Code
#include <iostream>
#include <vector>
#include <list>
#include <stdexcept>

template <typename Key, typename Value>
class MyHashMap {
private:
    static const int TABLE_SIZE = 100;  // 哈希表大小
    std::vector<std::list<std::pair<Key, Value>>> table;  // 使用链表处理冲突

public:
    MyHashMap() {
        table.resize(TABLE_SIZE);
    }

    // 哈希函数
    int hash(const Key& key) const {
        return std::hash<Key>{}(key) % TABLE_SIZE;
    }

    // 插入元素
    void insert(const Key& key, const Value& value) {
        int index = hash(key);
        for (auto& pair : table[index]) {
            if (pair.first == key) {
                pair.second = value;  // 如果键已经存在，更新值
                return;
            }
        }
        table[index].push_back(std::make_pair(key, value));  // 插入新的键值对
    }

    // 查找元素
    bool get(const Key& key, Value& value) const {
        int index = hash(key);
        for (const auto& pair : table[index]) {
            if (pair.first == key) {
                value = pair.second;
                return true;
            }
        }
        return false;  // 未找到
    }

    // 删除元素
    void remove(const Key& key) {
        int index = hash(key);
        auto& list = table[index];
        for (auto it = list.begin(); it != list.end(); ++it) {
            if (it->first == key) {
                list.erase(it);
                return;
            }
        }
    }

    // 打印哈希表（调试用）
    void print() const {
        for (int i = 0; i < TABLE_SIZE; ++i) {
            if (!table[i].empty()) {
                std::cout << "Index " << i << ": ";
                for (const auto& pair : table[i]) {
                    std::cout << "{" << pair.first << ", " << pair.second << "} ";
                }
                std::cout << std::endl;
            }
        }
    }
};

2.2 插入操作的演示

我们使用 MyHashMap 类进行插入操作演示。假设我们要存储一些简单的键值对，例如人员的名字和他们的年龄。

cppCopy Code
int main() {
    MyHashMap<std::string, int> hashMap;

    // 插入数据
    hashMap.insert("Alice", 30);
    hashMap.insert("Bob", 25);
    hashMap.insert("Charlie", 35);

    // 打印哈希表
    hashMap.print();

    // 查找元素
    int age;
    if (hashMap.get("Alice", age)) {
        std::cout << "Alice's age: " << age << std::endl;
    } else {
        std::cout << "Alice not found!" << std::endl;
    }

    // 删除元素
    hashMap.remove("Bob");
    hashMap.print();

    return 0;
}

2.3 使用场景

unordered_map 适用于需要高效查找、插入和删除的场景。常见的应用场景包括：

缓存系统：通过键值对存储缓存数据，使用快速查找来减少计算量。
计数器：用于统计元素出现的次数，比如统计单词频率。
映射数据存储：用于存储需要映射关系的数据，如学生和成绩的映射关系。

3. 封装哈希表实现 `unordered_set`

与 unordered_map 类似，unordered_set 是基于哈希表实现的集合容器，它只存储唯一的元素，不存储键值对。

3.1 设计 `MyHashSet` 类

与 unordered_map 类似，unordered_set 也采用了哈希表来实现集合的功能。我们将实现一个简单的 MyHashSet 类。

cppCopy Code
#include <iostream>
#include <vector>
#include <list>

template <typename Key>
class MyHashSet {
private:
    static const int TABLE_SIZE = 100;
    std::vector<std::list<Key>> table;

public:
    MyHashSet() {
        table.resize(TABLE_SIZE);
    }

    int hash(const Key& key) const {
        return std::hash<Key>{}(key) % TABLE_SIZE;
    }

    void insert(const Key& key) {
        int index = hash(key);
        for (const auto& item : table[index]) {
            if (item == key) {
                return;  // 元素已存在
            }
        }
        table[index].push_back(key);
    }

    bool contains(const Key& key) const {
        int index = hash(key);
        for (const auto& item : table[index]) {
            if (item == key) {
                return true;
            }
        }
        return false;
    }

    void remove(const Key& key) {
        int index = hash(key);
        auto& list = table[index];
        for (auto it = list.begin(); it != list.end(); ++it) {
            if (*it == key) {
                list.erase(it);
                return;
            }
        }
    }

    void print() const {
        for (int i = 0; i < TABLE_SIZE; ++i) {
            if (!table[i].empty()) {
                std::cout << "Index " << i << ": ";
                for (const auto& item : table[i]) {
                    std::cout << item << " ";
                }
                std::cout << std::endl;
            }
        }
    }
};

3.2 使用示例

cppCopy Code
int main() {
    MyHashSet<std::string> hashSet;

    // 插入数据
    hashSet.insert("apple");
    hashSet.insert("banana");
    hashSet.insert("cherry");

    // 打印哈希集合
    hashSet.print();

    // 查找元素
    if (hashSet.contains("banana")) {
        std::cout << "banana exists in the set." << std::endl;
    }

    // 删除元素
    hashSet.remove("apple");
    hashSet.print();

    return 0;
}

3.3 使用场景

unordered_set 适用于存储不重复元素的场景，例如：

集合运算：计算两个集合的交集、并集或差集。
去重：去除一个集合中的重复元素。
查找操作：用于快速查找某个元素是否存在。

4. 性能分析

4.1 时间复杂度

哈希表的基本操作（查找、插入、删除）的平均时间复杂度是 O(1)，但在最坏情况下（哈希冲突过多），这些操作的时间复杂度可能退化为 O(n)。

4.2 空间复杂度

哈希表的空间复杂度通常是 O(n)，其中 n 是哈希表中元素的数量。空间复杂度也受到哈希表容量的影响，通常我们会设置一个负载因子（load factor）来调整哈希表的容量。

5. 总结

本文介绍了如何封装实现一个简单的哈希表，并基于此实现了 unordered_map 和 unordered_set。我们通过代码示例展示了哈希表的基本操作，并分析了它们的性能。在实际开发中，哈希表是非常高效的数据结构，广泛应用于各种需要快速查找、插入和删除的场景。

希望通过本文的学习，您能更好地理解哈希表的实现原理和应用。

封装哈希表实现 unordered_map 和 unordered_set