Refresh

This website coolshell.cn/articles/925.html is currently offline. Cloudflare's Always Online™ shows a snapshot of this web page from the Internet Archive's Wayback Machine. To check for the live version, click Refresh.

如何比较两个数据表

如何比较两个数据表

有些时候,我们可能想要比较一下两个数据表,以找到其中不同的数据。比如,在进行数据移植的时候,或是在合并数据的时候,或是在比对验证数据的时候。当然比较两个表,需要这两个表结构是一样的。

我们先假设一下有如下表结构:

CREATE TABLE jajal
(
    user_id integer NOT NULL,
    first_name character varying(255),
    last_name character varying(255),
    grade character(1),
    CONSTRAINT jajal_pkey PRIMARY KEY (user_id)
)


然后,我们有两张表——jajal和jajal_copy,其内容如下:

 jajal

user_id first_name last_name grade
1 Some Dude A
2 Other Guy B
3 You are Welcome B
4 What Other A
5 INeed You C
6 Mixed Nuts Z
7 Kirk Land B
8 Bit Shooter A
9 Sun Microsystem C
10 Extra Fancy B

jajal_copy

user_id first_name last_name grade
1 Some Dude A
2 Other Guy B
3 You are Welcome B
4 What Other A
5 INeed You C
6 Mixed Nuts C
7 Kirk Land B
8 Bit Shooter A
9 Sun Microsystem C
10 Extra Fancy B

 

要比较这两张表的数据,找出不一样的数据行。我们可以使用outer join 技术。我给outer join做了一个链接,是Wikipedia的,如果你对这个技术不是很清楚,还请你行看看其技术细节。

下面是具体的SQL语句:

使用FULL OUTER JOIN

SELECT
     *
FROM
     jajal j
     FULL OUTER JOIN jajal_copy jc ON jc.first_name = j.first_name
     AND jc.last_name = j.last_name
     AND jc.grade = j.grade
     AND jc.user_id = j.user_id
WHERE
     j.user_id IS NULL
     OR jc.user_id IS NULL

运行结果如下:

user_id first_name last_name grade user_id first_name last_name grade
[NULL] [NULL] [NULL] [NULL] 6 Mixed Nuts C
6 Mixed Nuts Z [NULL] [NULL] [NULL] [NULL]

 

使用NATURAL FULL OUTER JOIN

关于natural join,你可以看看Wikipedia是怎么说的。

SELECT
       *
FROM
       jajal j
       NATURAL FULL OUTER JOIN jajal_copy jc
WHERE
       j.user_id IS NULL
       OR jc.user_id IS NULL

运行结果如下:

user_id first_name last_name grade
6 Mixed Nuts C
6 Mixed Nuts Z

 

MySQL SQL 代码

MySQL 并不支持 FULL OUTER JOIN,但是我们可以使用LEFT JOIN 和 RIGHT JOIN 来实现这一功能。如下所示。
SELECT
*
FROM
jajal j
LEFT JOIN jajal_copy jc ON jc.first_name = j.first_name
AND jc.last_name = j.last_name
AND jc.grade = j.grade
AND jc.user_id = j.user_id
WHERE
jc.user_id IS NULL
UNION ALL
SELECT
*
FROM
jajal j
RIGHT JOIN jajal_copy jc ON jc.first_name = j.first_name
AND jc.last_name = j.last_name
AND jc.grade = j.grade
AND jc.user_id = j.user_id
WHERE
j.user_id IS NULL

或者你更喜欢NATURAL JOIN 版本

SELECT
*
FROM
jajal j
NATURAL LEFT JOIN jajal_copy jc
WHERE
jc.user_id IS NULL
UNION ALL
SELECT
*
FROM
jajal j
NATURAL RIGHT JOIN jajal_copy jc
WHERE
j.user_id IS NULL

当然,如果你需要一个MySQL的存储过程的话,下面是一个示例:

DELIMITER $$

CREATE PROCEDURE `db_schema`.`tablediff`
    (schema_name VARCHAR(64), table1 VARCHAR(64), table2 VARCHAR(64))
BEGIN
    DECLARE done INT DEFAULT 0;
    DECLARE sql_statement TEXT DEFAULT '';
    DECLARE sql_statement_where TEXT DEFAULT '';
    DECLARE sql_statement_pk TEXT DEFAULT '';
    DECLARE col_name VARCHAR(64);
    DECLARE col_name_cur CURSOR FOR
        SELECT
            COLUMN_NAME
        FROM
            information_schema.COLUMNS
        WHERE
            TABLE_SCHEMA = schema_name
            AND TABLE_NAME = table1
    ;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    OPEN col_name_cur;
    traverse_columns: LOOP
        FETCH col_name_cur INTO col_name;

        IF done THEN
            CLOSE col_name_cur;
            LEAVE traverse_columns;
        END IF;

        SET sql_statement_where = CONCAT(sql_statement_where,
            ' AND a.', col_name, ' = b.', col_name);
        SET sql_statement_pk = CONCAT(sql_statement_pk,
            'AND b.', col_name, ' IS NULL');
    END LOOP;

    SELECT
        COLUMN_NAME INTO col_name
    FROM
        information_schema.KEY_COLUMN_USAGE
    WHERE
        CONSTRAINT_SCHEMA = schema_name
        AND CONSTRAINT_NAME = 'PRIMARY'
        AND TABLE_NAME = table1
    LIMIT 1
    ;
    IF col_name IS NOT NULL THEN
        SET sql_statement_pk = CONCAT('AND b.', col_name, ' IS NULL');
    END IF;

    SET sql_statement = CONCAT('SELECT * FROM ', schema_name, '.', table1, ' a LEFT JOIN ', schema_name, '.', table2, ' b ON TRUE');
    SET sql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ', sql_statement_pk);
    SET sql_statement = CONCAT(sql_statement, ' UNION ALL SELECT * FROM ', schema_name, '.', table1, ' b RIGHT JOIN ', schema_name, '.', table2, ' a ON TRUE');
    SET sql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ', sql_statement_pk);

    SET @s = sql_statement;
    PREPARE stmt1 FROM @s;
    EXECUTE stmt1;
    DEALLOCATE PREPARE stmt1;

END$$
DELIMITER ;

 

 

PostgreSQL 下的SQL语句

下面是PostgreSQL的一个存储过程:

CREATE OR REPLACE FUNCTION tablediff (
    IN schema_name VARCHAR(64),
    IN table1 VARCHAR(64),
    IN table2 VARCHAR(64)
) RETURNS BIGINT AS
$BODY$
DECLARE
    the_result BIGINT DEFAULT 0;
    sql_statement TEXT DEFAULT '';
    sql_statement_where TEXT DEFAULT '';
    sql_statement_pk TEXT DEFAULT '';
    col_name VARCHAR(64);
    col_name_cur CURSOR FOR
        SELECT
            column_name
        FROM
            information_schema.columns
        WHERE
            table_catalog = schema_name
            AND table_schema = 'public'
            AND table_name = table1
    ;
BEGIN
    OPEN col_name_cur;

    LOOP
        FETCH col_name_cur INTO col_name;
        IF NOT FOUND THEN
            EXIT;
        END IF;

        sql_statement_where := sql_statement_where || ' AND a.' || col_name || ' = b.' || col_name;
    END LOOP;

    SELECT
        column_name INTO col_name
    FROM
        information_schema.table_constraints tc
        JOIN information_schema.constraint_column_usage ccu ON
            ccu.constraint_name = tc.constraint_name
    WHERE
        tc.table_catalog = schema_name
        AND tc.table_schema = 'public'
        AND tc.table_name = table1
    LIMIT 1
    ;

    IF col_name IS NOT NULL THEN
        sql_statement_pk := ' a.' || col_name || ' IS NULL';
        sql_statement_pk := sql_statement_pk || ' OR b.' || col_name || ' IS NULL';
    END IF;

    sql_statement := 'SELECT COUNT(*) FROM ' || schema_name || '.public.' || table1 || ' a FULL OUTER JOIN ' || schema_name || '.public.' || table2 || ' b ON TRUE';
    sql_statement := sql_statement || sql_statement_where || ' WHERE ' || sql_statement_pk;

    EXECUTE sql_statement INTO the_result;

    RETURN the_result;
END;$BODY$
LANGUAGE 'plpgsql' STABLE;

 

文章:来源

(转载本站文章请注明作者和出处 酷 壳 – CoolShell ,请勿用于任何商业用途)

好烂啊有点差凑合看看还不错很精彩 (7 人打了分,平均分: 3.29 )
Loading...

如何比较两个数据表》的相关评论

  1. 想到一个对比两个表数据的方法,
    mysqldump tableA > tableA.sql
    mysqldump tableB > tableB.sql
    sed -i ’s/tableB/tableA/g’ tableB.sql
    diff tableA.sql tableB.sql
    但如果表数据里有tableB会不成功

发表回复

您的电子邮箱地址不会被公开。 必填项已用*标注