Rearranging comments in Drupal

Moving comments around in Drupal is not (yet) straightforward. There is an experimental module that aims at providing this type of funcitonality, but the maintainers strongly discourage you in using it on live sites. If you're a daredevil, just go ahead and play with the comments table. You might ruin your comment structure though… unless you want to understand how Drupal manages comments. After reading this post you should be able to move comments without fear.

Comments under the hood

Drupal stores comments to nodes in the comments table. In Drupal 6, this table contains the following fields:

FieldTypeDescription
cid integer

The comment identifier. This is the unique identifier for a comment.

pid integer

The parent comment identifier. If the comment is a reply to a 'parent' comment, this value is set to thee comment identifier of the parent comment. Otherwise it is set to 0, which means that the comment is a direct comment on the node.

nid integer

The node identifier. This is the reference to the node to which this comment belongs to. The node identifier (nid) is the unique identifier for a given content node in Drupal. It is also an integer.

uid integer

The user identifier. This is the reference to the Drupal user that posted the comment. The user identifier (uid) is the unique identifier tor a given registered user in Drupal. It is also an integer.

Anonymous users are characterized by uid = 0.

subject text

The subject of the comment.

comment text

The actual comment.

hostname text

The IP address of the author of the comment.

timestamp integer

The UNIX timestamp of the comment.

status integer

The status of this comment (it is 0 if the comment is published).

format integer

The input format of this comment. The format identifier maps to the value column of the filter_formats table.

thread text

This field stores the position of the comment in a thread. This information is represented as a directory structure like text string, such as: "01/", "01.00/", "02.01.00/". How to read this?

  • "01/" means the 1st comment to the node
  • "01.00/" means the 1st reply (00) to the 1st comment (01) to the node
  • "02.01.00/" means the 1st reply (00) to the 2nd comment (01) to the 2nd comment (02) to the node

The reasoning behind this format is given in the Drupal API documentation. Basically, you must remember this:

  • The sequence number (i.e., the order) of a comment in a thread is given by the last digit group:
    • "nn/" starting at nn = 1 for direct replies to the node; and
    • ".nn/" starting at nn = 0 for replies to comments.
  • Subcomments of a given parent comment share the parent comment's thread identifier as prefix, where the slash is replaced with a period. Hence:
    • If the parent comment has the following thread identifier: "a.b.c.d/"
    • Then all child comments of this comment have a thread identifier starting with "a.b.c.d." (note the period instead of the slash)
name text

Name of the author of the comment.

mail text

Email address of the author of the comment.

homepage text

Home page of the author of the comment.

Threads

This probably is the most complicated thing relating to comments in Drupal. Basically, a thread represents a conversation, where information about replies is preserved. As I explained in the table above, Drupal keeps information about which thread a comment belongs to, in the the thread comumn of the comments table.

If Drupal only maintained the cid and pid fields, we would still be able to order comments in threads, but the order of the replies at a certain hierarchical level would be lost. You could argue that the post date (the timestamp field) should suffice to address this. True, if we can guarantee that the post date does not get overwritten by an update. However, there is only one timestamp field preserved in the comments table. Whenever somebody edits the comment, the timestamp will change as well. This means that we cannot rely on the timestamp field for ordering the comment replies at a certain depth level in a thread: comments would jump around because of edits to them.

The comments module in Drupal core eventually opted for a solution without an updated timestamp field. There probably are several reasons for this decision. One reason is performance of database queries for reconstructing the thread structure. Another is that the directory-like scheme is able to avoid recursion. Indeed, the position in the thread is easily inferred from the dotted decimal identifier, but in addition this dotted decimal identifier can be used as a search pattern in queries!

The numbering scheme used starts counting from 1 for top-level comments (i.e., direct replies to the node), and from 0 for replies to existing comments. For a given node, an example comment tree might look like:

Comment hierarchycidpiddepth
Drupal thread identifier
Comment
Comment 1     1

0 (n/a)

1 1/ Comment 1 is the first comment to the node. Hence it has no parent comment identifier (n/a). In the Drupal database this is represented as pid = 0.
  Comment 2   2 1 2 1.0/ Comment 2 is the first reply (hence: ".0/") to comment 1. Its parent comment is comment 1, hence pid = 1.
  Comment 5   5 1 2 1.1/ Comment 5 is the second reply (hence: ".1/") to comment 1. Like comment 2, its parent comment is also comment 1. However, it comes after comment 1 in the thread. This is expressed in the thread identifier, where at level 2, we see 1 instead of 0.
    Comment 16 16 5 3 1.1.0/ Comment 16 is the first reply (hence: ".0/") to comment 5, which is the second reply to comment 1. Hence comment 16 has comment 5 as parent comment.
Comment 3     3

0 (n/a)

1 2/ Comment 3 is the second top level comment to the node.
  Comment 7   7 3 2 2.0/
    Comment 15 15 7 3 2.0.0/
  Comment 14   14 3 2 2.1/ Comment 14 is the second reply (hence: ".1/") to comment 3, hence its parent comment is 3, but the thread identifier at level 2 is 1 instead of 0 for comment 7.
Comment 4     4

0 (n/a)

1 3/
  Comment 6   6 4 2 3.0/
    Comment 9 9 6 3 3.0.0/
  Comment 8   8 4 2 3.1/
  Comment 10   10 4 2 3.2/  Comment 10 is the third reply (hence: ".2/") to comment 4 (hence pid = 4).
    Comment 11 11 10 3 3.2.0/
    Comment 13 13 10 3 3.2.1/ Comment 13 is the second reply (hence: ".1/") to comment 10 (hence pid = 10), which in its turn is the third reply to comment 4.
  Comment 17   17 4 2 3.3/

 Comment 17 is the fourth reply (hence: ".3/") to comment 4 (hence pid = 4).

Comment 12     12

0 (n/a)

1 4/ Comment 12 is the fourth direct comment to the node.

Moving a comment in a thread

To move a comment in the thread, we must execute the following steps:

  1. Identify the new pid of the comment that we want to move. If the thread moves at top level, set pid = 0, otherwise set pid to the new parent's cid.
  2. Update the thread identifier to define the new position of the moved comment in the thread. Thread identifiers that are affected, are at the same level of the place where the comment will be moved, and below. This means that we need to get the thread identifier of the moved comment's new parent node, and replace the trailing slash with a period: every comment whose thread identifier starts with this sequence falls in the scope of the next steps:
    1. Save the original thread identifier of the comment we want to move.
    2. The thread identifier of all comments (at the same level) before the moved comment remain unchanged
    3. The moved comment takes the next thread identifier (at the same level)
    4. Its child comments inherit the new thread identifier from the parent comment. This means that the moved comment's original thread identifier is a prefix that has to be replaced with the moved comment's new thread identifier. Everything after the original thread identifier remains the same (we aren't changing the order of the comments below the comment we want to move).
    5. If other comments exist at the same level, their thread identifier must also be updated, as well as its child comments (same procedure as in previous step).

A very simple example

Imagine comment 15 should have been a reply at the node level. To preserve the timeline, we want to move comment 15 below comment 12. Stepwise, we need to proceed as follows:

  1. The new pid of comment 15 will be 0, since we want to move it at top level.
  2. Updating the thread identifier for comment 15 will only affect comment 15: there are no comments under comment 12, and comment 15 has no child comments.
    1. The original thread identifier for comment 15 is "2.0.0/"
    2. As already said, in this example only comment 15 will have a changed thread identifier.
    3. The new thread identifier for comment 15 will be 1 after comment 12 at level 1, hence 1 more at level 1 than "4/", hence "5/"
    4. Comment 15 has no child comments, hence we can skip this step.
    5. There are no more comments at level 1 after comment 15, hence we are done.

Summarizing, in this example we only needed to make 2 edits:

  • Change the pid of comment 15 from 7 to 0.
  • Change the thread identifier of comment 15 from "2.0.0/" to "5/".

A more elaborate example

Imagine we want to move comment 12 between comments 11 and 13.  Stepwise, we need to proceed as follows:

  1. The new pid of comment 12 will be 10, since comments 11 and 13 have comment 10 as parent. Hence we set the new pid of comment 12 to 10.
  2. Now we must update the thread identifier of comment 12 (and its descendants), hence all comments that have a thread identifier starting with "3.2." (note the period at the end of this text string). The original thread identifier is
    1. From the table we read that the original thread identifier of comment 12 is "4/".
    2. All comments up to and including comment 11 remain unchanged.
    3. Comment 12 will take the place of comment 13, hence we need to update the thread identifiers from here onward (at the level where comment 12 will sit). In our case, this boils down to comments 12 (moved) and 13. The new thread identifier for comment 12 is the current thread identifier of comment 13: "3.2.1/".
    4. Since comment 12 has no children, nothing needs to be done here.
    5. Remember we move comment 2 to level 3. Hence we only need to consider comment 13 now, not comment 17 (it's at level 2, so it does not see the 3rd level we are editing in the move). Comment 13 will now receive the next thread identifier at its level (level 3). Hence it goes from "3.2.1/" to "3.2.2/".

Summarizing, in this example we only needed to make 3 edits:

  • Change the pid of comment 12 from 0 to 10.
  • Change the thread identifier of comment 12 from "4/" to "3.2.1/".
  • Change the thread identifier of comment 13 from "3.2.1/" to "3.2.2/".

A more elaborate example

Imagine we want to make comment 10 (and its children) the first child of comment 7. In other words, we want comment 10 to appear below comment 7 but above comment 15, at the same level as comment 15. The steps to follow are:

  1. The new pid of comment 10 is 7.
  2. The scope of thread identifier change is everything below comment 7, hence all comment identifiers that start with comment 7's thread identifier(minus the slash) are affected. The matching prefix is "2.0." (note the period).
    1. The original thread identifier of comment 10 we want to move, is "3.2/". The children of comment 10 have a thread identifier starting with "3.2." (note the period instead of the slash). This is easy to read from the table above.
    2. Since comment 10 will be the first child of comment 7, skip this step.
    3. The first identifier at this level, is "2.0.0/", and this will be the new thread identifier for comment 10.
    4. Now we must move the children of comment 10. This is easy: we only need to replace the thread identifier. More precisely, we must replace the prefix "3.2." with "2.0.0.". Hence comment 11 will have "2.0.0.0/" as thread identifier, and likewise comment 13 will have "2.0.0.1/" as new thread identifier.
    5. The only comment that remains to be processed now, is comment 14. It receives the the next thread identifier after the previous one that was assigned to comment 10. Hence, "2.0.1/".

Summarizing, in this example the following 5 edits have to be made:

  • Change the pid of comment 10 from 4 to 7.
  • Change the thread identifier of comment 10 from "3.2/" to "2.0.0/".
  • Change the thread identifier of comment 11 from "3.2.0/" to "2.0.0.0/".
  • Change the thread identifier of comment 13 from "3.2.1/" to "2.0.0.1/".
  • Change the thread identifier of comment 14 from "2.0.0/" to "2.0.1/".

For sake of completeness

I deliberately kept things a bit too simplified:

  1. In principle we also need to rebuild the thread identifiers in the comment branch where the comment originated from. However, leaving 'holes' in the identifiers (e.g. going from "2.0.0/" to "2.0.2/") will not affect the order, and no information will be lost. Hence it is not really necessary to clean up the 'pruned' branch.
  2. In reality Drupal constructs the thread identifier with two-digit sequence numbers at each level. Hence the thread identifier example "3.2.0/" would in fact be stored as "03.02.00/".

Deleting comments

From the procedure given above it is clear that the simplest way of deleting a comment will also have to delete its children. Otherwise we will end up with orphaned comments.

If we want to preserve child comments from a comment we want to delete, then we must first assign a new parent comment to the soon-to-be-orphaned comments. This basically boils down to moving all children of the comment we want to delete. This has already been described above.

Moving comments to another node

In this scenario, we must change the nid of the comment we want to move, as well as for all child comments of this comment. Then the regular comment move steps must be performed, as described above.

iN8sWoRld.net's picture

Thanks

Thanks so much for this post! I used to use the comment mover module on one of my Drupal sites back on 4.7, but the new maintainer's warnings scared me off of using it on 6.20 (so far ;) This seems like such a basic administrative function (being able to move a comment that was posted in the wrong thread to a more appropriate location) that I'm surprised its not in core! Using your instructions I was able to successfully move a comment to the right place by making the changes directly in the table (using phpmyadmin) and I have a better understanding of the structure of the comment table, thanks!

MhueD's picture

adapting this technique to moving comments between sites

Am I right in assuming that the main challenge in making this cross-site is whether one can get the cid/pid key joins to not interfere with the progress of the cid field in the target database because it is auto-incremented? I suppose there is also a problem with uid, because even dupe users have different ids unless the two sites are sharing a user table. In my case, all I want to do is move a node -- WITH ITS COMMENTS -- to a subdomain of our main site, to gather better SEO 'love', and I am having one heck of a time. Some of our comment threads are ginormous and manual copying, even the db-centric form you are suggesting, seems pretty time-consuming. Definitely an opportunity for big-time module project here, though the node-export people should take it as a feature request (they don't seem interested yet, based on http://drupal.org/node/889204.)

Olivier Biot's picture

Re: adapting this technique to moving comments between sites

If you want to move an entire node and its comments across Drupal installations, then you'll have to address at least the following matters:

  1. The poster's user identifier might be different or even nonexistent across the Drupal installations.
    • There are probably a couple tricks you could use, but they'll require you writing the intelligence for looking up and creating suitable users in the target Drupal installation.
    • Simplest extremes are: users are the same (e.g., through openId or b sharing the same user database across Drupal sites), and "make all users anonymous in the target Drupal".
  2. The node identifier will most lilkely be different, as source Drupal and target Drupal might have a different number of nodes.
    • You'll have to use the Drupal node creation API and store the target nid when adding the comments to the node.
  3. The comment identifier will most lilkely be different, as source Drupal and target Drupal might have a different number of comments.
    • Each comment you want to move will receive a new cid value on the target Drupal installation. You can walk the comment tree from the top downwards, while keeping track of new target comment IDs and target parent comment IDs while creating new comments one at a time in the target Drupal installation.

There are probably some other things that should be done as well, such as checking the input filters that apply to nodes and comments, creation of CCK node types across sites if the content is not a "regular" node.

eddie taylor's picture

nice interesting and very

nice interesting and very informative post, thanks for sharing.

saurabh's picture

hello

So far I visit your blog and found very informative post and your sharing is such a useful for me and other really thanks for sharing such a great post.

Dan Knowlson's picture

Thank you

Using your advice I was able to move comments from a populer product on my Ubercart site from one content type to another

Ardarvin's picture

Don't forget to modify node_comment_statistics (d7 maybe d6?)

In Drupal 7, if you are moving comments to another node, don't forget to alter the node's node_comment_statistics table. Fields are self explanatory. Don't really need timestamp, but count, uid, and cid are important.

Nemo's picture

nice information

nice information

Post new comment

The content of this field is kept private and will not be shown publicly. If you have a Gravatar account associated with the e-mail address you provide, it will be used to display your avatar.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Allowed HTML tags: <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd> <img>
  • Lines and paragraphs break automatically.

More information about formatting options

By submitting this form, you accept the Mollom privacy policy.