WPwatercooler

EP475 – The Great Escape: WordPress Data Liberation Project

February 9, 2024

On this episode of WPwatercooler, titled “The Great Escape: WordPress Data Liberation Project,” Jason Tucker and Jason Cosper discuss the significance of data portability in WordPress. They delve into the recent push towards data liberation, enabling content to move freely between different Content Management Systems (CMS), including the import and export of data from WordPress to other platforms and vice versa. The conversation touches on the challenges and technical considerations involved in migrating data, the impact of block editors, and the importance of making data migration user-friendly to support the growth and flexibility of WordPress as a platform. The episode explores the broader implications of data portability for users and the ecosystem, highlighting the need for more intuitive tools to facilitate data movement without compromising content integrity.

Chapters

  • 00:00 Introduction
  • 01:41 Discussion on Data Liberation in WordPress
  • 03:33 The Importance of Data Portability
  • 10:13 Challenges with Block Editors and Data Migration
  • 15:20 Exploring Alternatives and Future of Data Formats
  • 20:59 User Experiences and Expectations on Data Migration
  • 25:46 Technical Aspects and Solutions for Data Export/Import
  • 30:00 The Role of Open Source in Data Portability
  • 35:27 Final Thoughts on WordPress and Data Liberation

What is WPwatercooler? WPwatercooler is streamed live and recorded as the self-titled show on the WPwatercooler Network. Our objective with the show since the beginning has been to help people in this industry have a place to hear people, much like themselves, talk about the technologies and methods we all use on a daily basis. We named WPwatercooler to be that, the watercooler that WordPress folks can gather around and participate in the conversation, or just sit back and learn from the discussion. Our listeners and contributors come from all walks of life and all backgrounds. We strive to make this place as welcoming and accessible as we can. Learn more at https://wpwatercooler.com/wpwatercooler

What is Dev Branch? Dev Branch is streamed live and recorded monthly on the first Friday of the month as the developer-focused discussions of the WPwatercooler Network. Dev Branch is released on its own podcast feed and made available live and on-demand in video format on Facebook, YouTube, LinkedIn, and Twitch. Learn more at https://wpwatercooler.com/devbranch

Want to create live streams like this? Check out StreamYard: https://streamyard.com/pal/d/5756954563575808

Episode Transcription

[00:00:00] Jason Tucker: This is episode number 475 of WPwatercooler, The Great Escape: WordPress Data Liberation Project. I’m Jason Tucker, go to my website at jasontucker.blog. You can find all the links to me over there.

[00:00:26] Jason Cosper: Sé Reed is on assignment, and holy calamity, scream insanity, all you ever gonna be’s another great fan of me, break.

[00:00:41] Jason Tucker: Go and take a look at wherever it is you can find awesome podcasts, right, Cosper? Awesome podcasts. And you can go and check out our Discord.

[00:00:52] Jason Cosper: it’s the

[00:00:52] Jason Tucker: I’m saying that I’m remembering what you sent me. I was like, ah, screw it. Who cares? Oh man. Yeah. Podcasts. Fun stuff. So it’s Friday. We’re, we’re, we’re pushing the record button.

[00:01:10] Jason Cosper: Yeah, we’re, we’re down a, we’re down a Sé Reed, but that’s okay. Like, uh, yeah, hope, hope everything’s going all right for her. She’s, uh, just, uh, needed, needed a day to, to refresh. And we’re going to be talking about, uh, the Data Liberation project in WordPress today.

[00:01:30] Jason Tucker: Yeah. Yeah.

[00:01:31] Jason Cosper: So, uh, I, I know that, um, this is something that I proposed, so I guess I’ll just, uh, kind of take it all and, uh,

[00:01:40] Jason Tucker: Please do.

[00:01:41] Jason Cosper: Yeah, um, so this is kind of, uh, a thing that was sort of surprise-announced during State of the Word, uh, around data liberation. Um, and kind of freeing content from other, uh, CMSs, uh, helping bring it into WordPress.

[00:02:02] Jason Cosper: Uh, historically WordPress has had a number of importers available that, uh, help people pull content in from, uh, you know, other software like Blogger, um, you know, from other WordPress sites, et cetera. Uh, and one of the things that, uh, our benevolent dictator, um, sort of, uh, you know, handed down from the mountain said, Hey, uh, let’s start bringing in,

[00:02:31] Jason Cosper: uh, content from other content management systems. Let’s, let’s try to make, uh, the data portable for everybody. And while I think that this is a good thing, uh, while I think that, um, you know, this is definitely something that, uh, that we need, uh, as a project to, to kind of help the project, uh, continue to grow.

[00:02:54] Jason Cosper: Uh, and it’s funny to say to help the project continue to grow since we’re at what, like 46-ish percent of the web, like

[00:03:05] Jason Tucker: Yeah.

[00:03:05] Jason Cosper: how much more, uh, you know, market share do we need, uh, to help things grow? Like, I think we’re doing okay. Um, but, uh, data portability is, uh, a good thing. Um, not only moving stuff in, uh, from the likes of Wix and Squarespace and all these other, uh, builders, um, but,

[00:03:33] Jason Cosper: um, being able to move stuff out. Um, I know, uh, Tucker, you, uh, kind of got just a little itch and wanted to try something new. You’ve been working, uh, on micro.blog lately, um, and, and, you know, seem to be fairly active on that and, and playing around with that. Um, did you import your content in from WordPress into micro.blog

[00:04:01] Jason Cosper: or?

[00:04:03] Jason Tucker: I did. And, um, I think, I think things would have gone really well if I never touched blocks

[00:04:13] Jason Cosper: Oh, okay.

[00:04:15] Jason Tucker: because the second you start going like, Oh, I’m going to make columns and I’m going to make this and I’m going to get all fancy with all my stuff. Um, all that fancy is just a bunch of comments inside the blog post. And, um, sans fancy, it is the most boring-ass blog post ever when it gets moved to someplace else

[00:04:35] Jason Tucker: that doesn’t support blocks. And, um, it still works fine. I just got a whole bunch of, like, markup that’s sitting in there that isn’t needed. But, um, I also have blog posts from a hundred years ago that have shortcodes all over them and I just have naked shortcodes, you know, throughout the entire, uh, uh, thing as well.

[00:04:54] Jason Tucker: So, um, I think just, you know, the evolution of, of using WordPress and going from one technology type to another, you tend to, you know, come up with all the extra cruft that you’re going to have to deal with when you move from place to place, especially when, you know, those objects are represented by a, um, a shortcode.

[00:05:13] Jason Cosper: Yes.

[00:05:14] Jason Tucker: inside there is a whole bunch of like very smart data that did not make its way over.
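
A minimal sketch of the cleanup Tucker is describing, assuming the leftover markup is the usual block delimiter comments (`<!-- wp:name {...} -->`, `<!-- /wp:name -->`, and self-closing `<!-- wp:name /-->`). The function name and the sample post are made up for illustration; the HTML between the delimiters and any shortcodes are left untouched.

```python
import re

# Block delimiters are HTML comments whose body starts with "wp:" or "/wp:".
BLOCK_COMMENT = re.compile(r"<!--\s*/?wp:.*?-->", re.DOTALL)


def strip_block_comments(content: str) -> str:
    """Remove block delimiter comments, keeping the HTML between them."""
    without = BLOCK_COMMENT.sub("", content)
    # Collapse the runs of blank lines the removed delimiters leave behind.
    return re.sub(r"\n{3,}", "\n\n", without).strip()


sample = (
    '<!-- wp:paragraph -->\n<p>Hello</p>\n<!-- /wp:paragraph -->\n\n'
    '<!-- wp:separator /-->\n\n'
    '<!-- wp:image {"id":7} -->\n<figure><img src="a.jpg"/></figure>\n<!-- /wp:image -->'
)

if __name__ == "__main__":
    print(strip_block_comments(sample))
```

This only removes the delimiters. Layout that lived in block attributes (columns, alignment) is simply dropped, which matches the "boring" result described above.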

[00:05:19] Jason Cosper: Right. And, and that is kind of, uh, one of the big things that I think, um, is an argument for, um, being able to, like, this, this data liberation. Uh, you know, sure, should we be, uh, liberating data from, uh, from Drupal and Blogger and, uh, Tumblr? Like, yes, absolutely. Like, you know, um, one would argue that, um, you know, Drupal, that’s an open platform. Like, you know, putting your stuff in Drupal, uh, is not the same as sticking your site into Squarespace or Wix, uh, or even Tumblr or something that is, uh, pretty locked down and obscured behind the scenes. Like, um, you know, um, now correct me if I’m wrong, uh, micro.blog

[00:06:08] Jason Cosper: is what, Hugo?

[00:06:10] Jason Tucker: Hugo. Yeah, it’s Hugo with a lot of, um, front end or, you know, uh, interface customization that was kind of built to do stuff. So for instance, when you do that import, you’re, um, you’re importing, um, like it goes and reads all the HTML to pull in all the images. And so it will then, you know, source those images from your website and pull them into it.

[00:06:36] Jason Tucker: Which means you have to kind of think about things a little bit before you start doing this stuff. And then, you know, because of the fact that some of these images may be figures, some of them may be img tags, some of them may be who knows what, you know, and you’re just going to have to go back and clean stuff up, or you

[00:06:57] Jason Tucker: do that before you move it. So there’s a bit of a sanity check that you got to do there. And maybe even some kind of, um, cleanup to kind of make it all happen. And I’ve read some folks, you know, that were saying, Oh, you can do it this way. You could, you know, they’re all coming up with their own solutions for things.

[00:07:13] Jason Tucker: But me being the lazy person that I am, I just pushed the button and closed my eyes and said, do I really care about the previous posts? Whatever. The words made it over. Maybe the images made it over or not. You know, you have, like, srcset, like image srcsets and all sorts of, like, weird stuff that just didn’t copy over.

[00:07:33] Jason Tucker: And a whole bunch of, if you ever ran Jetpack, any Jetpack type stuff in there as well, cause it’s being sourced from another, you know, URL and it’s only pulling stuff from your, you know, your domain, not from any of the, um, external CDNs or anything like that that’s being used. So it gets a little

[00:07:53] Jason Tucker: funky, but it worked out fine.
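
The pre-move sanity check mentioned here can be sketched as a small audit: collect every image URL in a post, from both `src` and `srcset`, and separate the ones on your own domain from the ones served elsewhere. This is an illustration under assumptions, not how micro.blog's importer works; `example.com` stands in for your site, `i0.wp.com` is used as an example of a CDN host, and the `srcset` handling is a naive comma split.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageCollector(HTMLParser):
    """Collects image URLs from <img> and <source> tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("img", "source"):
            return
        attr = dict(attrs)
        if attr.get("src"):
            self.urls.append(attr["src"])
        # srcset is a comma-separated list of "URL descriptor" candidates.
        # Splitting on "," is a simplification that breaks on URLs containing commas.
        for candidate in (attr.get("srcset") or "").split(","):
            parts = candidate.split()
            if parts:
                self.urls.append(parts[0])


def split_by_host(content, own_host):
    """Return (local, external) image URLs, de-duplicated in document order."""
    collector = ImageCollector()
    collector.feed(content)
    local, external = [], []
    for url in dict.fromkeys(collector.urls):
        host = urlparse(url).netloc
        (local if host in ("", own_host) else external).append(url)
    return local, external


sample = (
    '<figure><img src="https://example.com/a.jpg" '
    'srcset="https://example.com/a-300x200.jpg 300w, '
    'https://i0.wp.com/example.com/a.jpg 1024w"></figure>'
    '<img src="/uploads/b.png">'
)

if __name__ == "__main__":
    local, external = split_by_host(sample, "example.com")
    print("local:", local)
    print("external:", external)
```

Anything in the external list is a candidate for the "didn't copy over" problem and would need to be downloaded and re-linked by hand before the move.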

[00:07:56] Jason Cosper: Yeah, but I mean, still, Hugo is another open source, um, content management system. Uh, micro.blog is just a very nice front end, like, put onto Hugo, but still, um, moving from one open source project to another, um, you really shouldn’t lose this data. You really shouldn’t, um, end up, um, you know, missing out on, you know, you’ve put your, uh, content into WordPress, which is, uh, you know, uh, an, an open source program. Uh, you know, it’s not locked down behind some obscure

[00:08:40] Jason Cosper: thing like Squarespace is, like Wix is, like, uh, you know, any of these other, uh, website builders are. I, I don’t know what GoDaddy is doing, but I’ve heard, um, people talk about, like, GoDaddy builder, uh, for stuff. And that’s not to, like, uh, you know, ding GoDaddy for anything that they’re doing, but, um, you know, you’re, you’re putting your, your stuff into this locked system, you expect, okay, it might be a little harder to

[00:09:09] Jason Cosper: get things out. Uh, you’re putting things into WordPress and you feel like you should be able to, to move out, uh, you know, if you want to. Um, so having, uh, this thing that you’re running into where, uh, you know, maybe, uh, you know, if you decide to use the Jetpack CDN, uh, and stuff doesn’t migrate over, if you decide to, um, you know, use, uh, other components, uh, like, uh, having the shortcode there, like, um, having the shortcode in the, in the posts, like when you migrate stuff, like it should, one would imagine,

[00:09:51] Jason Tucker: One would imagine.

[00:09:52] Jason Cosper: Yeah, that it would, uh, expand and grab all that stuff.

[00:09:56] Jason Cosper: Cause the shortcode is just telling, uh, WordPress, like, Hey, I need this little bit of code, like, so don’t, don’t be daft, like actually put the bit of code that’s needed in there. Uh, put that little rendered bit in there.
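
Inside WordPress, shortcodes are expanded at render time by `do_shortcode()`. What Cosper is asking for is the same expansion at export time. The sketch below shows the idea outside WordPress, assuming simple self-closing shortcodes with double-quoted attributes; the `now_playing` shortcode, its handler, and the markup it emits are hypothetical stand-ins for whatever a plugin used to render.

```python
import html
import re

# Matches [name attr="value" ...]; enclosing shortcodes ([a]...[/a]) are not handled.
SHORTCODE = re.compile(r'\[(\w+)((?:\s+\w+="[^"]*")*)\s*\]')
ATTR = re.compile(r'(\w+)="([^"]*)"')


def now_playing(attrs):
    # Hypothetical handler: renders the metadata the shortcode used to display.
    song = html.escape(attrs.get("song", ""))
    return '<p class="now-playing">Listening to: {}</p>'.format(song)


HANDLERS = {"now_playing": now_playing}


def render_shortcodes(content):
    """Replace known shortcodes with rendered HTML; leave unknown ones as-is."""

    def replace(match):
        handler = HANDLERS.get(match.group(1))
        if handler is None:
            return match.group(0)
        return handler(dict(ATTR.findall(match.group(2))))

    return SHORTCODE.sub(replace, content)


if __name__ == "__main__":
    print(render_shortcodes('Today [now_playing song="Blue Monday"] and [gallery ids="1,2"]'))
```

Unknown shortcodes are left visible on purpose, so nothing is silently dropped and the remaining "naked" ones can still be found and fixed later.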

[00:10:13] Jason Tucker: Yeah. That makes me, that makes me think like, is, do you feel that, that it’s, it would be a better way to go about this in a, like having that content in a rendered state. Already like fully rendered out, like, uh, you’re practically using, you know, um, some kind of like a browser front end, you know, headless front end to like read it and then pass it over.

[00:10:40] Jason Tucker: And so you’re actually getting the HTML of it versus. You know, I, I don’t even know how to explain this other than just to say these types of things, but just like, instead of pulling it from the database itself or from whatever kind of rendering engines that happened within WordPress to finally get it to display on the page, but rather having a browser emulator, scrape it and then pump it back in, what do you think the difference between those two would be other than processing power?

[00:11:10] Jason Cosper: Yeah. I mean, I mean, look, there are tools out there that will, and I mean,

[00:11:18] Jason Tucker: or something like that. Was it Phantom? Isn’t it Phantom? The, you know, like one of those types of, like, you know, browser emulators?

[00:11:26] Jason Cosper: Sure. Sure. I mean, you could do that. You could, um, the thing that sprung to mind for me is, um, those tools that actually make a static version of your website. Um, so like WP2Static, uh, a few of the other plugins out there. Uh, I mean, that is not embedding the shortcode, that is giving you, like, the raw,

[00:11:51] Jason Cosper: uh, HTML, uh, of it all, um, and, you know, pulling down the things that you actually need. Uh, I, um, kind of, and I put this into our little, uh, private chat and we’ll make sure it makes it into the show notes, like, this came up and the reason I proposed this topic is because, uh, I’ve just been thinking about it, like, um, WXR, the, the format, uh, WordPress eXtended RSS, I think is what it stands for.

[00:12:26] Jason Cosper: Um, and WXR, the, the format, um, is, uh, something that they kind of, uh, there, there’ve been a few little minor tweaks to it over the years. Um, but that is, like, what, uh, WordPress, like, exports, uh, when you ask it to give you an export of your site. Um, but in a lot of cases, uh, like you said, it, it doesn’t, that’s how you imported your stuff over into micro.blog, right?

[00:12:58] Jason Cosper: Yeah. Um, it gives the shortcode. It doesn’t give the actual rendered HTML. It gives, uh, some things that where, where you’re like, Hey, we’re 20 years into this as a project, like the only place I should see a shortcode anymore, well, you shouldn’t be seeing shortcodes at all anymore. It’s all just blocks, but the only place that you should see shortcodes is like in those classic editor posts that you haven’t converted over.

[00:13:28] Jason Tucker: Yeah. It’s like the exporter has a, um, an assumption that the, the, the data that’s going to be ingested is going to be displayed the same way in which it was previously. Like the assumption is, you know, it’ll work when we get to, when we get to the other place, it’ll work where, you know, if you’re moving it from one spot to another, that like I lost, I was just thinking, I’m like, what did I lose when I had these shortcodes that were in there?

[00:13:56] Jason Tucker: I think at one point, you know, you play around with stuff, right? You get on this whole little, like, Oh yeah, I want to do this thing. You know, I had like a health kick for a little while and I was like, Oh yeah, I want to track all my stuff. And I had like a shortcode that was, like, displaying these things in it, kind of like LiveJournal back in the day of like, what song were you listening to when you wrote your blog, when you wrote your blog post, like that sort of thing.

[00:14:18] Jason Tucker: And so I did that and I’m like, yeah, that’s all gone. You know, there’s just a shortcode that says like, you know, Jason shortcode underscore something, something, cause you know, it’s like, it is what it is. I didn’t care, but that’s what’s sitting in those blog posts now probably is something like that instead of the, you know, the five or six, you know, pieces of metadata that, you know, that I was collecting painstakingly and then, you know, just threw to the wayside as I, um, imported the shortcode with, with no data in it.

[00:14:50] Jason Cosper: Right. So I, I proposed, um, this thing on, um, on the Fediverse. Like I, I, I made a post on, uh, my Mastodon instance about like, Hey, if you were designing, um, the, the WXR format today, like, what would you have in there? Uh, would it look completely different? Uh, I mean, things have been moving, uh, to JSON for quite some time.

[00:15:20] Jason Cosper: Like I, I don’t, RSS is really the, the big thing holding on to, like, XML. Um, you know, like what, how would you design this, uh, today? I also, um, another thing that I’ve looked at is, uh, there’s this format called, uh, TextBundle, which effectively is, um, like a, like a Markdown file. Sorry, Sé, you’re not here to, to slag Markdown.

[00:15:49] Jason Cosper: Um, but, uh, it’s effectively a Markdown file with, um, all of its, uh, like, added components. So say you added, um, you know, like you’re writing a note in Markdown and you added, uh, a couple images and maybe, like, a PDF, to, like, um, like how you would in, like, Apple Notes or some other note-taking app, um, so you, like, can make reference to it later. Uh, instead of giving you just a Markdown file and then going like, Oh, let’s pull this image in later,

[00:16:24] Jason Cosper: let’s pull this other thing, the TextBundle would actually give you the media that you attached to that Markdown file alongside the, the Markdown. It seems like a really novel thing, but

[00:16:41] Jason Tucker: does this as well as a couple others. Um, also I was trying to think, I’m like, where did I hear, where did I hear this from? Oh yeah. That’s what it was. It’s, it’s all the Markdown editors and all that stuff that were just trying to figure out how to bundle everything together.

[00:16:55] Jason Cosper: Craft.

[00:16:56] Jason Tucker: Yeah.
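
To make the bundle idea concrete: as described at textbundle.org, a TextBundle is a folder holding the Markdown text, a small `info.json`, and an `assets/` directory with the attached media. The sketch below writes one, assuming that layout and the version 2 `info.json` fields; the helper name, file names, and placeholder image bytes are invented for the example.

```python
import json
import tempfile
from pathlib import Path


def write_textbundle(dest, markdown, assets):
    """Write a TextBundle folder: info.json, text.md, and assets/<files>."""
    bundle = Path(dest)
    (bundle / "assets").mkdir(parents=True, exist_ok=True)
    info = {
        "version": 2,
        "type": "net.daringfireball.markdown",  # identifies the text as Markdown
        "transient": False,
    }
    (bundle / "info.json").write_text(json.dumps(info, indent=2), encoding="utf-8")
    (bundle / "text.md").write_text(markdown, encoding="utf-8")
    for name, data in assets.items():
        (bundle / "assets" / name).write_bytes(data)
    return bundle


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        bundle = write_textbundle(
            Path(tmp) / "hello.textbundle",
            "# Hello\n\n![A photo](assets/photo.jpg)\n",
            {"photo.jpg": b"\xff\xd8\xff"},  # placeholder bytes, not a real JPEG
        )
        for path in sorted(bundle.rglob("*")):
            print(path.relative_to(bundle).as_posix())
```

The point of the format is visible in the Markdown itself: the image reference is a relative path into the bundle, so the post and its media travel together instead of pointing back at the old site.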

[00:16:57] Jason Cosper: Yeah, a few others. Uh, Craft, interestingly enough, uh, if you haven’t played with that note-taking app, uh, Craft uses, um, blocks to, like, lay your content out, so like, it’s not like this is unheard of.

[00:17:18] Jason Tucker: Right. Right.

[00:17:19] Jason Cosper: to do, um, blocks and, and everything

[00:17:23] Jason Tucker: just imported it to that. That’s what I should have done. Yeah.

[00:17:27] Jason Cosper: interestingly, on, uh, textbundle.org, um, it says that one of the apps that supports this is WordPress, uh, through the iOS and iPadOS, um, like, apps.

[00:17:43] Jason Cosper: So I would assume, like, um, moving stuff in, like if you shared a note from Bear or something like that. Um, so some component of WordPress already understands TextBundles. Having something, and I’m not saying, like, let’s start exporting our posts into Markdown, but, uh, I know, I don’t know a ton of people who actually use, um, the importer and, well, okay, they use the importer.

[00:18:18] Jason Cosper: Uh, but I don’t know a lot of people who use the exporter to, like, migrate their site somewhere else. They end up falling back on a migration plugin. They end up falling back on, um, something where they have a little more control. They can make sure that, like, their images that they’ve attached to the posts make their way over.

[00:18:40] Jason Cosper: Um, and normally that means, uh, a zip file that’s a few hundred megs, or if you’ve had a site for long enough, like Tucker and I have had, um, you know, a few gigs, a few dozen gigs, who knows? Um, you know, like that’s going to be one hell of a TextBundle, uh, to move around. So like, uh, when I asked on, uh, the Fediverse, like what, what should we do here?

[00:19:10] Jason Cosper: Like what, um, what should a modern, um, like, you know, import/export format look like? Uh, and maybe we’re getting a little too Dev Branch here. We don’t have Sé here to, like, uh, to, to rein us in and say that we’re getting a little too nerdy with it, but, um,

[00:19:35] Jason Tucker: won’t say anything.

[00:19:36] Jason Cosper: Yeah. Okay, cool. Thank you. Um, I really think that, um,

[00:19:43] Jason Cosper: having something, um, that is a portable, that is, is a bundle like that, having something that, uh, the thing that I initially proposed was like, um, you know, since it’s no longer 2004, like we don’t need, uh, XML anymore. Like, would you redo the entire, would, uh, WXR 2, I mean, even though the R in there is for RSS, which is an XML thing.

[00:20:12] Jason Cosper: Uh, would this be JSON? Would this be, like, what would it look like? Uh, there was a really interesting conversation. I kind of, uh, shared this around. I shared it in our Discord. I shared it over in the, uh, Post Status Slack. Uh, sorry for cheating on our Discord. Um, and, uh, you know, had some people chiming in, like Ryan McCue, who, uh, wrote and, and was writing an improved WordPress importer that, like, really kicked the crap out of the default WordPress importer, that handled, uh, things a little more dynamically, that didn’t, uh, you know, run into, like, timeouts.

[00:20:59] Jason Cosper: So if you have, like, a large, um, WXR file, uh, that importer, it’s, it’s not very good. Like you basically have to, like, break out WP-CLI. And I, I know since we’re not on Dev Branch, everybody that’s watching right now is going, no, WP-CLI makes me uncomfortable.

[00:21:20] Jason Tucker: Right.

[00:21:21] Jason Cosper: Um, so like you shouldn’t have to break out the command line to get your site imported and, uh, Ryan kind of, uh, speaking of, of running out of spoons, ran out of spoons to keep, um, working

[00:21:37] Jason Cosper: on this project, because, uh, I mean, it still works. Uh, Ryan’s, uh, like, importer plugin, uh, still works, still works fairly well. Uh, and imports a lot of WXR files that I’ve seen that are maybe too beefy, whatever, um, for, you know, like a standard, like, web-based import. Um, but one of the things that Ryan, uh, said, uh, cause let’s see, who was it?

[00:22:09] Jason Cosper: Um, I’m looking at my thread here. Uh, Amanda Carson, um, suggested that, uh, you know, maybe we should, uh, like, do JSON. Uh, and Ryan argued back and, uh, I expect Ryan to know a lot about this and, uh, be able to, to give a cogent argument here. Uh, Ryan argued back that, um, actually, WordPress, uh, and just generally, uh, processing JSON is a lot more resource intensive than processing XML.

[00:22:48] Jason Cosper: Um, for large scale data, XML is actually better than JSON as PHP’s tools for parsing streaming data are much better for XML than JSON. Uh, if anything, JSON would be less of a standard than the current WXR XML-based format. Um, so, and, yeah, and he linked, and I’ll make sure this also makes it into the show notes, a, uh, Make WordPress Core blog post back from, like, 2015.

[00:23:20] Jason Cosper: So almost 10 years, nine years old, um, about how it’s still, and things are still at the state, uh, where XML is just kind of, uh, uh, a quicker, easier thing, uh, to import. Um, but I, I really think that just given the state of things, given the problems, especially that you ran into, uh, getting something that, you know, it’s like data portability, yay, let’s liberate our data,

[00:23:56] Jason Tucker: Right?

[00:23:57] Jason Cosper: To be a little bit of a pain in the ass to get your data out of
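
Ryan's point was about PHP's tooling, but the underlying idea, streaming a large XML export one post at a time instead of loading the whole file, can be illustrated in Python with `xml.etree.ElementTree.iterparse`. This is a sketch under assumptions: the sample is a cut-down stand-in for a WXR file that keeps only `<title>` and `<content:encoded>`, and `iter_posts` is a made-up helper name. It says nothing about how JSON parsers compare.

```python
import io
import xml.etree.ElementTree as ET

# Namespace used by <content:encoded> in RSS/WXR exports.
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"


def iter_posts(source):
    """Yield (title, body) for each <item>, parsing the file incrementally."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "item":
            title = elem.findtext("title", default="")
            body = elem.findtext(CONTENT_NS + "encoded", default="")
            yield title, body
            # Release this item's text and children before moving on.
            elem.clear()


sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
<item><title>First</title><content:encoded><![CDATA[[now_playing song="x"] <p>Hi</p>]]></content:encoded></item>
<item><title>Second</title><content:encoded><![CDATA[<p>Bye</p>]]></content:encoded></item>
</channel>
</rss>"""

if __name__ == "__main__":
    for title, body in iter_posts(io.BytesIO(sample)):
        print(title, "->", body)
```

Note that the first post comes out with its shortcode still unexpanded, which is exactly the export behavior discussed earlier in the episode.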

[00:24:01] Jason Tucker: Yeah. There’s a lot of cleanup that happened. I, you know, there was some regex I had to come up with to go through and start cleaning some stuff up. And the other thing is that data is now being hosted someplace else where I don’t have a WP-CLI type, you know, a CLI type thing to get in there and mess with it.

[00:24:22] Jason Tucker: So I would have to take that data then in state process it and then push it. You know, so it’s a, it’s a, it’s a bit different there. Um, but yeah, I, I’m, I’m still, I’m still kind of hung up on the fact that the data, that this data that we’re moving does not include video. It does not include audio, does not include photos, does not include any of that stuff that’s stored in a shortcode a hundred years ago.

[00:24:50] Jason Tucker: It doesn’t, there’s a whole bunch of these things that are like missing out of it. Even if you put like inline JavaScript. Is it grabbing the references to that JavaScript? Like there’s a whole bunch of these like pieces that I don’t know if it’s, if it’s picking up or not. And is it going to make it onto the other side?

[00:25:09] Jason Tucker: I don’t, I don’t know. We assume that everything is just, you know, a blog post talking about, you know, what you’re doing, how, what you’re eating, what you’re listening to, but there’s plenty of other blog posts that are a little bit more data heavy that we may be missing some of that data, you know,

[00:25:26] Jason Cosper: Right, and, and see, one of the things

[00:25:29] Jason Tucker: ActiveX components, man.

[00:25:32] Jason Cosper: sure, sure. See, and, and this is the, the kind of big thing of it all, you and I are professionals,

[00:25:42] Jason Tucker: Right.

[00:25:43] Jason Cosper: we are okay with diving in there, like, you know, getting elbows deep in a problem and, uh, you know, either, uh, coding something up ourselves or, uh, we know the right things to ask some sort of AI assistant to help us, like, code a solution

[00:26:02] Jason Cosper: and, uh, to, to get us to where we need to be. Um, like, but for the most part, we’re doing all this stuff ourselves. Most people are not just doing this themselves. They’re throwing their hands up and they’re like, well, like, I guess I still am just going to use WordPress. And as far as the WordPress project is concerned, that’s fine.

[00:26:26] Jason Cosper: They want you to keep using WordPress. Um, but this is, uh, this sort of, like, lock-in, uh, that’s happening, this, uh, attention that’s not being paid to make sure that data liberation works both ways, um, is, is kind of another, uh, symptom, and it’s a symptom of, uh, surprisingly in, in even open source projects, this happens, uh, of what my friend Cory Doctorow calls enshittification. Like, um, things are getting

[00:27:01] Jason Cosper: worse, like, because, uh, you know, WordPress now at, uh, almost half of the internet, uh, is an entrenched player. Um, and being able to not take your data, uh, and, and make sure that you get the whole thing and move it over to a service like micro.blog, move it over to another one of these content management systems, like

[00:27:29] Jason Cosper: Craft CMS or Kirby or something like that, uh, and, and have to do all of this, uh, recovery and, uh, you know, and, and managing of data and getting elbows deep on all this stuff is, it’s just not a good look, man. Like, come on, we, we could, and should be doing better. Uh, and I, I think that, um, uh, the things that we’re doing, um, you know, being able, um, to let people have ways to freely move their stuff around. Um, you know, so they can have, uh, a Hugo phase, they can have, uh, a WordPress phase, like, um, you know, we need to not be locking people in to, like, giving them a way to move things easily into our system.

[00:28:28] Jason Cosper: And then it’s like, well, you’re here now. And, you know, if you don’t

[00:28:33] Jason Tucker: all the

[00:28:33] Jason Cosper: it,

[00:28:34] Jason Tucker: Yeah. And think of all the cruft you have to, you, you, you end up picking up as you go from system to system, because you know, this data, you know, this, if you go to my first blog post, that first blog post is from LiveJournal. You know, yeah. And the, in the posts before that, there’s some other posts that are in there as well, that were from, um, Movable Type, they were from, um, geez, uh, they were from, uh, all sorts of stuff.

[00:29:05] Jason Tucker: Like any of the early b2 stuff that’s in there too. There’s some Mambo data that’s in there. Like, name the CMS, man, I’ve, I’ve touched, I’ve touched a lot of the old ones and now these newer, newer ones. I’m like, Oh, you know, maybe this data does need to go live in, like, Ghost for a minute. Let’s see what happens.

[00:29:24] Jason Cosper: Right. See, and, and that’s, that’s the other, uh, big thing is like, you’ve been migrating this stuff. Like I, uh, have taken a different approach when, uh, things, like, start to get too heavy for me, uh, and bloated out too much. Like for, um, you know, 10, 15 years, uh, I was using jasoncosper.com. Uh, previously, uh, on the domain, what was it?

[00:29:51] Jason Cosper: onosendai.info, because, uh, William Gibson fan and everything else. Um, like that’s the only time I’ve ever moved anything. Uh, everything else, um, you know, I, I just go, okay, well, the life of this is over. I’m moving on to, like, the next domain, the next project, everything else. Uh, and like, I never moved my old Blogger blog posts into, into my WordPress install, half because a lot of those old posts, uh, and gosh, my old LiveJournal posts, super cringe.

[00:30:31] Jason Tucker: Oh yeah. All this is super cringe, dude. And there’s a lot of this where I was like, I’m hiding all of these, they exist and they will be moved, but I’m just hiding them from the public. But yeah, that, the other thing I was thinking about, uh, regarding this, is that’s something that like, just, you know, the, the normie folks that would have to deal with is the, this idea of WordPress building, you know, uh, 15, um, versions of an image file.

[00:30:56] Jason Tucker: So that way it can customly put it in a, in the correct spot, hopefully, depending on how the theme is, is built and how it’s being implemented, but which one of those is the one that gets moved?

[00:31:08] Jason Cosper: Right.

[00:31:09] Jason Tucker: Is it, is it the full

[00:31:11] Jason Cosper: It should be the full size.

[00:31:13] Jason Tucker: hopefully, you know, and is it, is it the full size? Is it, is it the WebP version of it

[00:31:21] Jason Tucker: that some, you know, um, image optimizer made? Is it, um, I don’t know, like what, like what is it, like what’s, what’s being moved? And, and how do I deal with that when it gets moved to the next spot? You know, maybe the new system doesn’t support WebP and it goes like, what the heck is this? Why, why’d you give me this?

[00:31:42] Jason Tucker: I, we haven’t, we haven’t updated to that part yet. So, you know, there, there could be those types of things that crop up as well.
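
One way to reason about "which file is the original" is by filename. WordPress names its generated sizes with a `-WIDTHxHEIGHT` suffix (for example `photo-300x200.jpg`), and large uploads can also get a `-scaled` copy. The sketch below groups an uploads listing by the likely original name. It is a filename heuristic only, with made-up helper names: it will misfire on an original that happens to end in something like `-2x4`, and it does not map a converted `.webp` back to its source `.jpg`.

```python
import re

# A size suffix is "-<width>x<height>" or "-scaled", directly before the extension.
SIZE_SUFFIX = re.compile(r"-(?:\d+x\d+|scaled)(?=\.[A-Za-z0-9]+$)")


def original_name(filename):
    """Strip a generated-size suffix to guess the original upload's name."""
    return SIZE_SUFFIX.sub("", filename)


def group_by_original(filenames):
    """Map each guessed original name to the files derived from it."""
    groups = {}
    for name in filenames:
        groups.setdefault(original_name(name), []).append(name)
    return groups


if __name__ == "__main__":
    uploads = ["a.jpg", "a-150x150.jpg", "a-300x200.jpg", "b-scaled.png", "c-1024x768.webp"]
    for original, variants in group_by_original(uploads).items():
        print(original, "<-", variants)
```

A migration tool could use a grouping like this to move one file per image and rewrite references to the others, rather than copying every generated size across.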

[00:31:49] Jason Cosper: Yeah, no, absolutely. I, I am pretty sure it, it should, but again, I, I put air quotes around that should, it should, um, you know, send the original image. Uh, I’m sure, um, you know, someone in the comments, someone on the Discord will, will chime in, uh, you know, if that’s the case or not, but like,

[00:32:14] Jason Tucker: Mm hmm.

[00:32:15] Jason Cosper: we should have a very, and no air quotes around this.

[00:32:19] Jason Cosper: We should, we really, really fucking should have a way to verify that, like, you know, the original image, uh, you know, the original metadata, uh, cause crap, we haven’t even brought up fields yet. Like, um, you

[00:32:36] Jason Tucker: Or, or our favorite post formats.

[00:32:40] Jason Cosper: yes,

[00:32:42] Jason Tucker: If you’re going to take a Tumblr and you’re going to bring it into something else, there’s some post formats that are involved. Who would’ve thought? Mm

[00:32:52] Jason Cosper: if you’re, uh, one of those, uh, rare, weird folks who uses, uh, like, the chat post format, or, um, you know, any, any of the other kind of, like, edge, uh, ones that aren’t like an audio post, uh, an image post, a video post, uh, even those have their own little complications. Like where is that video file hosted?

[00:33:19] Jason Cosper: Is it coming along? Um, but

[00:33:21] Jason Tucker: the company go out of business? That happened to me three times.

[00:33:25] Jason Cosper: what,

[00:33:26] Jason Tucker: R.I.P. Utterz.

[00:33:28] Jason Cosper: wow.

[00:33:31] Jason Tucker: Yeah. Yeah.

[00:33:34] Jason Cosper: Yeah.

[00:33:34] Jason Tucker: was sad. I have a lot of audio that, that didn’t make it into some, um, you know, somebody’s, uh, you know, somebody’s, uh, uh, selling of their company or closing their company or whatever, and all of your data just poof, disappears. And it’s like, oh, all right, cool.

[00:33:51] Jason Cosper: and, and that’s, that’s the big thing is, um, I mean, that, that classic saying of, uh, you know, the cloud, uh, when, you know, you’re say you’re putting something on the cloud, like you just need to remember that the cloud is really just someone else’s computer.

[00:34:08] Jason Tucker: Mm hmm.

[00:34:09] Jason Cosper: And, uh, I know a big part of this whole, uh, data liberation push is to like, put it on your computer, but honestly, and I’m saying this as somebody who works at a web hosting company, uh, you’re still paying someone else to, so like, unless you have a Raspberry Pi or, uh, you know, some sort of Linux box, like sitting on your desk, you are paying to host this stuff on someone else’s computer, uh, and being able.

[00:34:39] Jason Cosper: Like, you know, yes, you own the domain. Uh, yes. You everything else, but like, you should be able, uh, to, to move your stuff around, like you need to be able to like, get your content out of these sites. Like when it’s folding, when it’s, uh, you know, going under, um, yeah. So,

[00:35:00] Jason Tucker: bad part is, is with that is that, you know, let’s just say for instance, this, this Utterz thing, the idea was that, you know, you use their service to like record an audio file and then the audio file gets stored on their side. Then they go and give you a link that you just put on your website and it just. That’s great until that company goes out of business. And so how are you supposed to be the one to go and grab that, that source data and then, and, you know, put it into your website, you have to go through all those posts and update all this stuff. Um, there’s no good way to do it, you know, and this is where I was going back to

[00:35:38] Jason Cosper: you just made me,

[00:35:40] Jason Tucker: how do you do this for like.

[00:35:42] Jason Tucker: Any of those other things that are, that are embedded into your website. It almost sounds like you need to scrape it from the last part of this versus somewhere in the middle where you end up with a bunch of shortcodes or something.

[00:35:53] Jason Cosper: right. Uh, I mean, you, you just made me think you, you brought up, uh, Utterz and it’s, it’s like rattling around

[00:36:00] Jason Tucker: know, right?

[00:36:01] Jason Cosper: a little familiar. And I’m, I’m going to drop one that maybe more folks in the audience are familiar with, but probably not, uh, in the early days of Blogger, uh, Audioblogger, Audblog

[00:36:13] Jason Tucker: Yes.

[00:36:13] Jason Cosper: Um, which was a Noah Glass project, uh, and eventually, uh, got picked up by Odeo, which is the company that eventually turned into Twitter. Um, yeah, but, uh, I mean, Audblog, like, folded. And, uh, I remember making, um, a bunch of little, um, you know, posts. And now, like, if I go back and look at the, the export that I did on Blogger of all my old Blogger posts that, uh, live on a domain that no one will ever find out about, cause like I said, all those posts are pretty cringe. Uh, at least those cringe audio posts don’t live on. Um,

[00:36:57] Jason Tucker: I have to show you an old YouTube video. I put, I leave all that stuff out there, dude. I don’t care.

[00:37:04] Jason Cosper: Right. No, you know, the, the more I think about it, the more, yeah, uh, this, this whole, uh, uh, right to disappear thing that the EU has, or the, the right to be forgotten, uh, I think they’re onto something.

[00:37:20] Jason Tucker: are onto something. Yep.

[00:37:22] Jason Cosper: Yeah.

[00:37:23] Jason Tucker: can they actually get it to happen? That’s the other thing, you know, they just have to wait for like companies to go out of business. And then yes, you were forgotten. Good job.

[00:37:32] Jason Cosper: Right. Yeah. Well, I know that in the UK, like Google has to get rid of it. Uh, you know, like if you’re, if you’re like an, or not, I’m sorry, not the UK, the EU, um, so like in, in the EU, um, like if you make a right to be forgotten request to, um, to Google, to a few of those other places, like they have to take down, uh, the data that like corresponds to you, or they

[00:37:59] Jason Tucker: Mm hmm.

[00:38:00] Jason Cosper: The best possible, uh, I, I mean, we just got to wait for, uh, a state like, I mean, it’s, it’s, that’s not happening in the United States, at least not yet.

[00:38:10] Jason Cosper: But, uh, if it does happen, it’ll end up happening in a state, probably like California, uh, where, yeah, where we have, um, you know, certain data protections, et cetera, uh, that other parts of the country don’t have. Um, so,

[00:38:30] Jason Tucker: Could I bring up one more thing regarding you? You said Google and it made me think about something. So Google has this process called Google Takeout. The idea is that you go in, uh, you go and log into your interface and you say, Hey, I want to, I want to download my, anything and everything that exists about my account and let you download it.

[00:38:48] Jason Tucker: And then you can take that data and go to another Google account and import it back in. But the thing is, is like that data is pretty close to useless by itself. Right. Like it’s pretty close to useless by itself. Do you think that WordPress should have a Google Takeout for that data as well? Where it just says, cause like we keep talking about just like the, the root text data, you know, this, like just this one text file, that’s going to have all these links to go to different places and whatnot, but that doesn’t give you all the data that just gives you just that text file.

[00:39:27] Jason Tucker: It’s light, portable, easy to move around. But what do you, what about all of that stuff? And what do you, how do you do that? I mean, Twitter, we just went through this with Twitter. If you went to Twitter and you said, give me my download, you download this, like. Blob for lack of a better word of just none of it actually talks to each other.

[00:39:48] Jason Tucker: There may be some HTML files that are referencing things or who knows. I mean, I think I’d looked at it like once or twice, like way early on, but that data is not, that data is pretty useless. Somebody can go and write a thing that would read it and then be imported into something. But, um, what about that?

[00:40:06] Jason Tucker: What about this idea of, of doing some type of takeout of, of, of it? So you get. Everything instead of having to FTP in and grab, you know, certain copies of, you know, the small, medium, large, and, uh, and full size stuff. Like there has to be a way to parse through this because, you know, just cause WordPress thought it was a good idea to make, you know, nine versions of a, of an image file that you could at least get a smaller slimmed down version that’s like only the big files and nothing else.
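
Editor’s note: a minimal sketch of the "only the big files" idea Tucker describes here, not an existing WordPress tool. It assumes the common WordPress naming pattern in which resized copies get a `-WIDTHxHEIGHT` suffix before the extension (for example `cat-150x150.jpg` next to `cat.jpg`). The function name `originals_only` and the sample paths are hypothetical. A file is treated as a resized copy only when the un-suffixed original is also present, so an upload that merely happens to end in `-800x600` is kept.

```python
import re
from pathlib import PurePosixPath

# Matches a trailing "-<width>x<height>" on a file stem, e.g. "cat-150x150".
SIZE_SUFFIX = re.compile(r"-\d+x\d+$")


def originals_only(paths):
    """Return the paths that are not resized copies of another listed file."""
    names = set(paths)
    keep = []
    for p in paths:
        path = PurePosixPath(p)
        match = SIZE_SUFFIX.search(path.stem)
        if match:
            # Rebuild the name the original upload would have.
            base_name = path.stem[: match.start()] + path.suffix
            if str(path.with_name(base_name)) in names:
                continue  # resized copy of a file we are already keeping
        keep.append(p)
    return keep


uploads = [
    "2024/02/cat.jpg",
    "2024/02/cat-150x150.jpg",
    "2024/02/cat-300x200.jpg",
    "2024/02/poster-800x600.png",  # no "poster.png" exists, so this is kept
]
print(originals_only(uploads))
```

A real export would also need to handle plugin-specific variants (such as optimizer-generated WebP copies), which this sketch does not attempt.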

[00:40:36] Jason Cosper: yeah, I, I think that, um, and maybe this is going back a little bit to like, um, TextBundle, uh, having something that is effectively, uh, a takeout of your data that’s in there, uh,

[00:40:49] Jason Tucker: Mm hmm.

[00:40:51] Jason Cosper: uh, having, um, you know, the things that correspond, I, I remember adding, um, YouTube. Um, links used to have to be embedded in shortcodes, stuff like that.

[00:41:04] Jason Cosper: And now, uh, it’s just a, an oEmbed, uh, as long as that YouTube link is still up, like the video will still be there. Um, like, so having, I, I think it would be an interesting, uh, experiment to have a, a tool that would pre render the shortcodes that would get rid of, um, Like all the styling data around. So it would just, uh, almost like a modified, um, I mean, you’d have to use a couple components, but, um, a modified, um, like, um, like WP2Static type plugin, uh, that would output.

[00:41:50] Jason Cosper: Um, this is like the HTML for the page with the, the rendered shortcode bits in there with this, with, uh, you know, the appropriate images, et cetera, um, make a takeout. Uh, I, I would argue go. Uh, you know, one step further, but again, this is getting into like, you’d have to run like a command line script or whatever.

[00:42:16] Jason Cosper: Uh, but having a, a takeout that, uh, ran a program like, uh, yt-dlp, uh, and grabbed, uh, those, uh, yeah, all the YouTube video, I mean, that, that runs on so many other

[00:42:34] Jason Tucker: my gosh.

[00:42:35] Jason Cosper: even funny, uh, you know. It would have grabbed, uh, conceivably those Utterz posts that would have grabbed, uh, so like any referenced data, like grab me, um, I don’t have the energy.

[00:42:53] Jason Cosper: I don’t have to, to make this project. Uh, but I definitely think that it would be, um, an interesting thing to have a way to actually get, uh, a proper, like full export, uh, of the stuff on your site. Um, yeah,
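
Editor’s note: a rough sketch of the first step of the "grab any referenced data" idea Cosper outlines, assuming you already have a page’s rendered HTML (shortcodes and embeds expanded). It only lists the media URLs a takeout tool would need to fetch; the downloading itself, for example handing video URLs to a tool like yt-dlp, is left out. The names `MediaRefCollector` and `referenced_media` and the sample markup are hypothetical.

```python
from html.parser import HTMLParser

# Tags whose "src" attribute usually points at media hosted somewhere.
MEDIA_TAGS = {"img", "audio", "video", "source", "iframe", "embed"}


class MediaRefCollector(HTMLParser):
    """Collect unique src URLs from media-like tags, in document order."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        if tag not in MEDIA_TAGS:
            return
        for name, value in attrs:
            if name == "src" and value and value not in self.refs:
                self.refs.append(value)


def referenced_media(html):
    """Return the media URLs referenced by a rendered HTML fragment."""
    collector = MediaRefCollector()
    collector.feed(html)
    collector.close()
    return collector.refs


page = (
    '<p>Hello</p>'
    '<img src="https://example.com/a.jpg">'
    '<iframe src="https://www.youtube.com/embed/abc123"></iframe>'
    '<a href="https://example.com/">a plain link, ignored</a>'
    '<img src="https://example.com/a.jpg"/>'
)
print(referenced_media(page))
```

Working from the rendered page rather than the stored post content is what Tucker suggests earlier: scrape "the last part" so you are not left with unresolved shortcodes.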

[00:43:14] Jason Tucker: a reverse. It’s like a reverse Movable Type. It’s pretty much what, what, what it is. Cause you’re, you’re, instead of taking a bunch of text and then turning it into HTML with Markdown and all that fun stuff and having it process it, you’re reverse processing it to get it back to a spot that you could then make it portable and, and.

[00:43:33] Jason Tucker: I’d hate to move that truck around, but you know, it’s going to be a lot of stuff that’s going to be in there, but yeah, you would essentially have a snapshot of your entire, uh, existence in that blog posts or in those blog posts on that website.

[00:43:46] Jason Cosper: Right. I, I, um, if anyone were to do this project, uh, I absolutely would want to use your site as, uh, one of the test instances, cause I’m sure that the, uh, the number of things that have been thrown in there over the years, the number of, uh, stuff you have to account for, uh, would probably cover most scenarios, uh, A lot of folks, maybe, uh, maybe a hundred, 200 blog posts.

[00:44:23] Jason Cosper: Like if that, uh, you know, most folks it’s maybe a dozen. Uh, but I mean, just everything you’ve done over the past, how, I mean, you know, you’re going back to, to LiveJournal, to, to Movable Type, to b2, like, yeah, it’s going to be a lot of stuff in there.

[00:44:43] Jason Tucker: Yeah. I think so. It was like, it would be a lot of stuff.

[00:44:48] Jason Cosper: Sé, Sé didn’t join us today. Uh, I’m, I’m sad that she’s not here. Um, but, and I was, I was concerned. I was like, how are we going to manage to fill half an hour and look at us? We’re 15 minutes over.

[00:45:04] Jason Tucker: Oh my gosh.

[00:45:06] Jason Cosper: Yeah, so I mean, this is, this is a topic that has a lot of legs to like, you know, um, I, I feel like we could, we could just keep going, but, uh, I, I value your time.

[00:45:23] Jason Cosper: I value my time and I value our listeners and viewers time. So, uh, Tucker, how about you hit that outro button?

[00:45:30] Jason Tucker: I shall hit the outro button. As soon as I find. Speaking of that outro button, I have pressed it and this is the music that plays during that outro. You can go over to our website at www.wpwatercooler.com/subscribe and subscribe to all the shows that we have going on over there. Plenty of things happening over there and I’m hoping that you will enjoy the content that we put out.

[00:45:54] Jason Tucker: Thank you very much. Talk to you all later. Bye bye.
