Importing MT-like data problem
-
I’m working on importing stuff from my html pages into WordPress via the MT import script and I seriously need help at the forum discussion post here.
If anyone has any experience with importing into the WP database tables or converting data to the MT import layout for importing through WP, PLEASE help me.
Thanks,
Lorelle
-
Ok,
1. You can’t import HTML pages into a database as records. It’s not a field delimited format. Really. It isn’t.
2. Don’t try and munge it into MT format first. Why? Because it’s not necessary.
3. Convert the data into WP compatible SQL statements.
3a. How do you find out what WP compatible SQL statements are? Easy. Export your wp_posts table and look at how it’s formatted. (Don’t use ‘extended inserts’ format, it’s less readable.)
4. Then import those SQL statements into the database via phpmyadmin or the mysql command line.
5. Enjoy.
Thanks for the reply, Kitten.
I have HTML data in the fields, not whole HTML to be imported.
I’ve exported the wp-posts table and examined it thoroughly. Is there a way to get around the specific order of the table or can I put in any “hints” that says “this is this so ignore the order?” Make sense?
I can go through and change the order of the import file, but we’re talking tedious stuff. Munging the stuff into MT format seemed best because the import script didn’t seem too fussy over the order of the information as long as the “title” was there like “TITLE: blah blah” and BODY: “blah blah blah”.
I’ve been at this for three weeks, so any help deserves hugs and flowers.
Databases aren’t mind readers, you have to tell them what is where. That’s done in the insert statements.
INSERT INTO wp_posts ( foo, bar, baz, etc...) VALUES ( one, two, three, etc... );
So that foo is one, bar is two, and baz is three.
Now you just have to have valid WP field names in the first set of ( ) (the ‘where’) and the post info in the second set (the ‘what’). As long as the what is mapped to the where, and the where exists in your table, it’ll import and be where you want it.
Brilliant. I forgot I can be specific within the INSERT… after weeks of this, the mind is fried.
Thanks,
Lorelle
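The column-list idea above can be sketched in a few lines of Python. This is only an illustration: `build_insert` is a hypothetical helper, not part of WordPress or MySQL, and it assumes the values are plain strings that will be wrapped in double quotes with backslash escaping.

```python
def build_insert(table, row):
    """Build one INSERT statement; the column list follows the order of `row`.

    `row` maps column names (the 'where') to values (the 'what'), so the
    source data can be in any order as long as each value is named.
    """
    columns = ", ".join(row.keys())
    # Escape backslashes first, then double quotes, then wrap in quotes.
    values = ", ".join(
        '"' + str(v).replace("\\", "\\\\").replace('"', '\\"') + '"'
        for v in row.values()
    )
    return f"INSERT INTO {table} ({columns}) VALUES ({values});"


# The title comes first here even though the table stores it later.
print(build_insert("wp_posts", {"post_title": "Post Title", "post_author": "1"}))
```

Because every value is paired with its column name, the order of the fields in the source file stops mattering.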
Ah, now I remember why this didn’t work before. Remember, I have HTML data with quotes around attributes. Using INSERT, I am limited to using field separators of “,” and a line break as the end of the record.
Using LOAD DATA INFILE I can establish distinctive separators with the FIELDS TERMINATED BY… etc.
Any way of combining the two so I have something that says INSERT X, Y, Z… using FIELDS TERMINATED BY…. INTO wp_posts….
EDIT: I guess I can set it up to be concurrent with every field in there with nothing in it in order to use the LOAD, but…is there a combo?
>Ah, now I remember why this didn’t work before.
>Remember, I have HTML data with quotes around attributes.
Why isn’t your data properly escaped? Then it’d be properly inserted into the database.
Make a post in WP with some HTML in it, dump it & look at it. Then you’ll see what the format should be.
>>>Why isn’t your data properly escaped?<<<
I’m not sure what you mean by escaped. And I guess I’m not explaining myself. I did a dump from the wp-table and examined it. I can do a search and replace to insert all the appropriate separators.
The problem is that the data isn’t in the right order. So I can’t do an INSERT and am stuck doing a LOAD. LOAD will not allow me to be specific with the field names like INSERT does, so I would have to manually go through the data and change what goes where.
I can manually go through the data and change the order of the information, but we are talking over 500 articles that aren’t short blogs. I can also blow out all the title, author name, excerpts, etc., and just import the data straight in, and then I will have to go through every thing in the database to add in the other information. More manual labor.
If it weren’t for the quotes around the HTML attributes, I could very easily import this into Excel or something and then realign the order. I might be able to do this in WordPerfect by creating a merge file, merging it into a table, realigning the table, and then getting it back into a format that will work with the INSERT. This is a lot of work, but if it is the only choice, I’ll do it.
I’m trying to work with what I have to avoid the manual labor. If I can’t, then a lot of other people who are trying to do what I’m doing, with a lot more data than I have, need to know this information, too.
Since it never began in a database, I’m trying to format it in a fashion to get it into the database. I really feel like I’m a pioneer in this, but I can’t be. Any and all help is appreciated, Kitten.
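The reordering described in this post does not have to be done by hand. A minimal Python sketch follows, assuming each record is one line and the fields are separated by a distinctive marker; the separator `|~|`, the two column orders, and the `reorder` helper are all made up for illustration.

```python
# Assumed order of the fields as they appear in the source file.
SOURCE_ORDER = ["post_title", "post_author", "post_content"]
# Assumed order wanted for the import.
TARGET_ORDER = ["post_author", "post_content", "post_title"]
# Hypothetical separator, chosen because it is unlikely to occur in HTML.
SEP = "|~|"


def reorder(line, source_order=SOURCE_ORDER, target_order=TARGET_ORDER, sep=SEP):
    """Return the fields of one record rearranged into the target order."""
    fields = line.rstrip("\n").split(sep)
    if len(fields) != len(source_order):
        raise ValueError(f"expected {len(source_order)} fields, got {len(fields)}")
    by_name = dict(zip(source_order, fields))
    return [by_name[name] for name in target_order]


print(reorder('Post Title|~|1|~|<p class="red">Hi</p>'))
```

Quotes inside the HTML are left untouched at this stage, since the split only looks for the separator.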
I think what Kitten means by ‘escaped’ is inserting backslashes before the quotes; you could do that with find-replace in any decent text-editor.
Thanks. If it is that simple, then this should be really simple to import. I hope it is as simple as that.
Non-escaped data:
this is "something" that screws up my "importing"
Escaped data:
this is \"something\" that screws up my \"importing\"
the second you can stick between double quotes and import it just fine.
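The same find-and-replace can be scripted. A small Python sketch follows, assuming MySQL's default behaviour where a backslash is the escape character inside a quoted string; `escape_for_mysql` is a hypothetical name.

```python
def escape_for_mysql(text):
    """Backslash-escape a string so it can sit between double quotes."""
    # Escape existing backslashes first so the ones added next are not doubled.
    return text.replace("\\", "\\\\").replace('"', '\\"')


print(escape_for_mysql('this is "something" that screws up my "importing"'))
```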
I’m working on a write up for the codex on all of this, so I need to have some clarity.
The proper form for using INSERT with the specific fields listed, with \" acting as the escape for all the quote marks, would look like this example:
INSERT INTO wp-posts (post_author, post_date, post_content, post_title, post_excerpt)
VALUES ("1", "2005-1-14", "<p class="red">Something in \"red\" here.<p id=\"fred\">blah blah</p>","Post Title", "This is the excerpt blah blah")
Am I even on the right track?
Close, but your 3rd field would end right before the letter ‘r’ (after the ‘=’) since that quote is not escaped.
More like:
INSERT INTO wp-posts (post_author, post_date, post_content, post_title, post_excerpt)
VALUES ("1", "2005-1-14", "<p class=\"red\">Something in \"red\" here.
<p id=\"fred\">blah blah</p>","Post Title", "This is the excerpt blah blah")
Notice that the quotes enclose all the data; anytime a quote appears in the data it needs to be escaped. Here’s a trick to finding out if your data’s correctly formatted:
Import it.
If you get errors, note the line number, then go look at it, fix it. Wash, rinse, repeat, until it imports without error.
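Before running the import, a rough pre-check can catch the kind of unescaped quote discussed above. The Python sketch below is not a full SQL parser: `split_quoted_values` is a hypothetical helper that assumes every value is a double-quoted string and that backslash is the escape character.

```python
def split_quoted_values(values):
    """Split the text inside VALUES ( ... ) into fields, honoring backslash escapes.

    Raises ValueError when text appears outside quotes, which is what an
    unescaped quote inside a field produces.
    """
    fields, current, in_quotes, i = [], [], False, 0
    while i < len(values):
        ch = values[i]
        if in_quotes:
            if ch == "\\" and i + 1 < len(values):
                current.append(values[i + 1])  # keep the escaped character
                i += 2
                continue
            if ch == '"':  # closing quote ends the field
                in_quotes = False
                fields.append("".join(current))
                current = []
            else:
                current.append(ch)
        elif ch == '"':
            in_quotes = True
        elif ch not in ", \t\n":
            raise ValueError(f"unexpected {ch!r} at position {i}: probably an unescaped quote")
        i += 1
    if in_quotes:
        raise ValueError("unterminated string")
    return fields


print(split_quoted_values(r'"1", "<p class=\"red\">x</p>", "Post Title"'))
```

Fed the unescaped form `"<p class="red">`, it stops at the letter r, the same spot where the field ends too early.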
I worked on the code to make it appear in the post so many times, gee I wish there was a post preview here. I finally gave up and the slashes around the “red” were culled by the software onboard here.
Thanks for figuring it out. Now to sleep and give it a go in the morning. I’m eternally grateful and ready to give this a fling.
ARGHHH!
It won’t work and it won’t give me a specific error. Here is part of the code, if the forum’s software won’t rewrite some of it.
INSERT INTO wp-posts ("post_excerpt","post_title","post_author","post_date","post_category","post_content","post_status")
VALUES ("Internet Tips - Popup Spam Ads Spyware Gator Hotbar - how to get rid of, eliminate, and kill these nuisance programs on your computer.","Popups, Spammers, Spyware, Gator, GAIN, and Adware - Fighting Back","Lorelle VanFossen","11/25/2004 03:31:05 PM","Learn","<p>Determined to fight back against those who abuse the benefits of the Internet and the Web? As <a title=\"information and articles on nature photography, traveling, and writing\" href=\"../../about.html\">nature photographers and writers</a> traveling and living in a computer world, .........)
And goes on and on. If it told me that it imported X files, or to X line, I’d have a point to refer to. But the error says SQL error, please refer to the online documentation…blah blah. Nothing specific.
I escaped the quote marks thoroughly. This is only a three “post” test run. So it isn’t too big. The text file is like 65K.
Ideas?
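Nobody in the thread confirms the cause, so the following is only a guess at things worth checking. In MySQL's default SQL mode, double-quoted text is a string literal, so identifiers such as column names are quoted with backticks rather than double quotes. A table name containing a hyphen needs backticks too. A DATETIME value is expected in the form YYYY-MM-DD HH:MM:SS. A Python sketch of the two conversions follows; both helper names are hypothetical.

```python
from datetime import datetime


def to_mysql_datetime(text):
    """Convert a US-style 12-hour timestamp to MySQL's DATETIME text form."""
    # Assumes input shaped like "11/25/2004 03:31:05 PM".
    parsed = datetime.strptime(text, "%m/%d/%Y %I:%M:%S %p")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


def quote_identifier(name):
    """Wrap a table or column name in backticks, doubling any inside it."""
    return "`" + name.replace("`", "``") + "`"


print(to_mysql_datetime("11/25/2004 03:31:05 PM"))
print(quote_identifier("post_title"))
```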
I really need some help with this. I’m on a plane in less than 24 hours and will be gone for almost a month. With these uploaded, I can work on them from the road.