• Resolved sdai

    (@sdai)


    I found a thread here talking about XMP metadata, but adding XMP metadata to files is a huge headache compared to EXIF. My files have EXIF metadata, which is currently being read “successfully”, but not being parsed correctly. Here’s what I get. This is the UserComment field, but other fields also appear in this section equally garbled.

    post_id => 3656
    id3:GETID3_VERSION => 1.9.22-202207161647
    id3:filesize => 25648
    id3:filepath => /bitnami/wordpress/wp-content/uploads/2023/02
    id3:filename => testjpeg.webp
    id3:filenamepath => /bitnami/wordpress/wp-content/uploads/2023/02/testjpeg.webp
    id3:avdataoffset => 0
    id3:avdataend => 25648
    id3:fileformat => webp
    id3:video.resolution_x => 512
    id3:video.resolution_y => 512
    id3:encoding => UTF-8
    id3:mime_type => image/webp
    id3:riff.header_size => 25640
    id3:riff.WEBP.VP8X.offset => 12
    id3:riff.WEBP.VP8X.size => 10
    id3:riff.WEBP.VP8X.data => ???????
    id3:riff.WEBP.VP8 .offset => 30
    id3:riff.WEBP.VP8 .size => 25530
    id3:riff.WEBP.VP8 .keyframe => 1
    id3:riff.WEBP.VP8 .version => 0
    id3:riff.WEBP.VP8 .show_frame => 0
    id3:riff.WEBP.VP8 .data_bytes => 101584
    id3:riff.WEBP.VP8 .scale_x => 0
    id3:riff.WEBP.VP8 .width => 512
    id3:riff.WEBP.VP8 .scale_y => 0
    id3:riff.WEBP.VP8 .height => 512
    id3:riff.WEBP.EXIF.offset => 25568
    id3:riff.WEBP.EXIF.size => 72
    id3:riff.WEBP.EXIF.data => Exif??MM?*?????i?????????????????????(this is my comment smile?
    id3:riff.encoding => ISO-8859-1
    id3:playtime_seconds => 0
    id3:width => 512
    id3:height => 512
    id3:post_id => 3656

    I tried using about 50 different variants of template:([+id3:riff.WEBP.EXIF.data,single+]) but I wasn’t able to get anything to work, and even if it had worked I expect it would have grabbed the same garbled data anyway.

    Is there a way I can parse this data into a field? I am creating the files myself in python using Pillow, so I am quite flexible, but I need to pass data via metadata somehow. For now I am using jpegs and it works, but I want to start using webp.
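
For reference, the `id3:riff.WEBP.EXIF.offset` and `id3:riff.WEBP.EXIF.size` entries in the dump above are consistent with plain RIFF chunk arithmetic. Each chunk is a 4-byte tag, a 4-byte little-endian size, and then the payload, so 12 + 8 + 10 = 30 gives the start of the `VP8 ` chunk and 30 + 8 + 25530 = 25568 gives the start of the `EXIF` chunk. A minimal Python sketch of walking those chunks with only the standard library might look like this. The function names `iter_riff_chunks` and `find_exif_chunk` are hypothetical and are not part of MLA, getID3, or Pillow.

```python
import struct


def iter_riff_chunks(data: bytes):
    """Yield (fourcc, offset, size, payload) for each chunk in a RIFF/WEBP file."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WEBP":
        raise ValueError("not a RIFF/WEBP container")
    # The RIFF size field counts everything after the first 8 bytes.
    riff_size = struct.unpack_from("<I", data, 4)[0]
    end = min(len(data), 8 + riff_size)
    pos = 12  # first chunk starts right after "RIFF", size, "WEBP"
    while pos + 8 <= end:
        fourcc = data[pos:pos + 4]
        size = struct.unpack_from("<I", data, pos + 4)[0]
        payload = data[pos + 8:pos + 8 + size]
        yield fourcc, pos, size, payload
        # Chunk payloads are padded to an even number of bytes.
        pos += 8 + size + (size & 1)


def find_exif_chunk(data: bytes):
    """Return the raw payload of the first EXIF chunk, or None if there is none."""
    for fourcc, _offset, _size, payload in iter_riff_chunks(data):
        if fourcc == b"EXIF":
            return payload
    return None
```

Passing the bytes of a file (for example from `open(path, "rb").read()`) to `find_exif_chunk` would return the raw payload that appears garbled in `id3:riff.WEBP.EXIF.data`. That payload is binary EXIF/TIFF data and still has to be parsed before any field such as UserComment can be read.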

Viewing 15 replies - 1 through 15 (of 20 total)
  • Thread Starter sdai

    (@sdai)

    Conversely here is the exact same file saved as jpeg

    post_id => 3659
    exif:FileName => testjpeg.jpeg
    exif:FileDateTime => 1676112510
    exif:FileSize => 50414
    exif:FileType => 2
    exif:MimeType => image/jpeg
    exif:SectionsFound => ANY_TAG, IFD0
    exif:COMPUTED.html => width="512" height="512"
    exif:COMPUTED.Height => 512
    exif:COMPUTED.Width => 512
    exif:COMPUTED.IsColor => 1
    exif:COMPUTED.ByteOrderMotorola => 1
    exif:COMPUTED.UserComment => this is my comment smile
    exif:UserComment => this is my comment smile

    Plugin Author David Lingren

    (@dglingren)

    Thanks for your report and for posting the details of the metadata in your files; very helpful. The webp format is a recent development and I do not have much experience with it.

    If you could post a link to one or more images in that format with metadata such as the example above, I will see what I can do to improve your results. Thanks for your help and your interest in the plugin.

    Plugin Author David Lingren

    (@dglingren)

    I would also be interested in knowing more about how you added the EXIF data to the files. I have not been able to find a tool that does this. Thanks!

    Thread Starter sdai

    (@sdai)

    Hi David, thanks for your reply. I am using the Pillow module in python to do this, however as a small disclaimer I cannot say with 100% certainty that I am doing it the correct way. I’ve done a few hours of googling over the past couple of days and come up with not much about the “intended” way to implement metadata in webp, however this method does appear to work, and is very straightforward. Using tools such as https://jimpl.com/ to check the files shows that the data is added to the file.

    Here are two images, one with exif data set and one without: https://a.uguu.se/AEzTDjur.webp
    https://a.uguu.se/sgkMqNFG.webp

    The code used to generate the file:

    from PIL import Image

    def get_images():
        rules = "this is my comment smile"

        # Open the source image and convert it to RGB before re-saving.
        first = Image.open("1.webp")
        first = first.convert("RGB")

        # 0x9286 is the EXIF UserComment tag.
        exif_data1 = first.getexif()
        exif_data1[0x9286] = rules
        first.save("test.webp", optimize=True, quality=98, exif=exif_data1)

        # Re-open the saved file and print its EXIF data to confirm the tag was written.
        exifed = Image.open("test.webp")
        exif_data2 = exifed.getexif()
        print(exif_data2)

    get_images()

    The result of a metadata check showing the comment: https://jimpl.com/results/oMZ8qY6T8HiXsXNLtaWtuFZo

    Hex codes for setting specific fields: https://exiv2.org/tags.html

    Thread Starter sdai

    (@sdai)

    Additionally, using Pillow to assign a “default” value (0x010f, camera make) and using exiv2 to read it (which I believe is the “standard” for manipulating metadata) appears to work correctly too. https://i.imgur.com/3q7iVbO.png

    Trying to figure out the “correct” way to do this is not easy though… I can’t find a single thing about writing webp metadata programmatically, beyond a couple of C# libraries.

    Plugin Author David Lingren

    (@dglingren)

    Thank you for all the resources you posted and for your patience in awaiting progress. As you mentioned, it’s not easy to find any information about adding EXIF data to the webp “container”.

    I love a challenge, so I am working on a solution. After some false starts I have arrived at a promising approach:

    1. Generate a small JPEG image using the ImageMagick/IMagick PHP packages. These are the preferred components used by recent WordPress versions. Do you have them on your site?
    2. Take the EXIF data from ID3, add the header and splice it into the generated JPEG.
    3. Use the PHP exif_read_data() function to parse the EXIF data, converting hex codes to readable names, etc.
    4. Tidy up the parsed data to reflect the original webp file information and add it to MLA’s existing metadata elements.

    I will have a Development Version for you to try out shortly, but I wanted to make sure you have the required components in your WordPress install. Let me know what you think.
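
Step 2 in the list above can be sketched in a few lines. The following Python is only an illustration of the splice, not MLA’s actual PHP code. The function name `splice_exif_into_jpeg` is hypothetical, and the sketch assumes the EXIF payload is already a bare TIFF block (starting with `MM` or `II`) with no `Exif\0\0` prefix in front of it.

```python
import struct

SOI = b"\xff\xd8"          # JPEG start-of-image marker
APP1 = b"\xff\xe1"         # JPEG APP1 marker, used for EXIF
EXIF_HEADER = b"Exif\x00\x00"


def splice_exif_into_jpeg(jpeg: bytes, tiff: bytes) -> bytes:
    """Insert an APP1/EXIF segment right after the SOI marker of a JPEG.

    `jpeg` is the byte content of an existing JPEG (assumed to contain no
    EXIF segment of its own); `tiff` is the bare TIFF-structured EXIF block.
    """
    if not jpeg.startswith(SOI):
        raise ValueError("input does not start with a JPEG SOI marker")
    body = EXIF_HEADER + tiff
    # The 2-byte big-endian segment length counts itself plus the body.
    seg_len = len(body) + 2
    if seg_len > 0xFFFF:
        raise ValueError("EXIF block too large for a single APP1 segment")
    segment = APP1 + struct.pack(">H", seg_len) + body
    return jpeg[:2] + segment + jpeg[2:]
```

The result is the original JPEG with one extra segment after its first two bytes, which is the layout EXIF readers generally expect. Whether a given reader accepts the spliced file also depends on the rest of the JPEG being valid, which is why the approach starts from a real generated image.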

    Thread Starter sdai

    (@sdai)

    That’s a very clever solution! I did some digging, and it did seem that the limitations of WordPress’s built-in data parsing might make this difficult, but moving the data over to a JPEG and getting WP to parse that should get around them. I do have Imagick on my server and I’d be happy to test your dev version; your solution sounds like it would work well.

    Plugin Author David Lingren

    (@dglingren)

    Thanks for your patience while I worked on a solution. I have uploaded a new MLA Development Version dated 20230218 that parses the metadata found in the ID3 array and places it in the existing exif: array. You can find step-by-step instructions for using the Development Version in this earlier topic:

    PHP Warning on media upload with Polylang

    Once the Development Version is installed you can try it out on your example image. I imagine your testing will generate questions and further improvements. It will be much easier to communicate by email and I encourage you to contact me at my web site so we can do that. I will post a summary here once we’ve worked out the details.

    The new feature will be part of my next MLA version, but in the interim it would be great if you could install the Development Version and let me know if it works for you. Thanks for inspiring this MLA enhancement.

    Thread Starter sdai

    (@sdai)

    Hi David, this works great for some files, but it seems that some files created with Pillow have a prefix on the EXIF data, which may need to be skipped manually. I have been looking for what could cause this prefix, but I can’t figure it out, and it causes a parsing error. If skipping over the Exif(?)(?) prefix where present is all that is needed then that might be a workaround, but some files do not have that prefix and I’m trying to find out where it is and isn’t present.

    Here’s the parsed result:

    post_id => 3851
    exif:FileName => test2.webp
    exif:FileDateTime => 1676829753
    exif:FileSize => 57642
    exif:FileType => 2
    exif:MimeType => image/webp
    exif:SectionsFound =>
    exif:COMPUTED.html => width="550" height="368"
    exif:COMPUTED.Height => 368
    exif:COMPUTED.Width => 550
    exif:COMPUTED.IsColor => 1
    mla_exif_errors.0 => E_WARNING (2) exif_read_data(exif.jpg): Invalid TIFF alignment marker
    id3:GETID3_VERSION => 1.9.22-202207161647
    id3:filesize => 57642
    id3:filepath => /bitnami/wordpress/wp-content/uploads/2023/02
    id3:filename => test2.webp
    id3:filenamepath => /bitnami/wordpress/wp-content/uploads/2023/02/test2.webp
    id3:avdataoffset => 0
    id3:avdataend => 57642
    id3:fileformat => webp
    id3:video.resolution_x => 550
    id3:video.resolution_y => 368
    id3:encoding => UTF-8
    id3:mime_type => image/webp
    id3:riff.header_size => 57634
    id3:riff.WEBP.VP8X.offset => 12
    id3:riff.WEBP.VP8X.size => 10
    id3:riff.WEBP.VP8X.data => ???%?o?
    id3:riff.WEBP.VP8 .offset => 30
    id3:riff.WEBP.VP8 .size => 57534
    id3:riff.WEBP.VP8 .keyframe => 1
    id3:riff.WEBP.VP8 .version => 0
    id3:riff.WEBP.VP8 .show_frame => 0
    id3:riff.WEBP.VP8 .data_bytes => 134256
    id3:riff.WEBP.VP8 .scale_x => 0
    id3:riff.WEBP.VP8 .width => 550
    id3:riff.WEBP.VP8 .scale_y => 0
    id3:riff.WEBP.VP8 .height => 368
    id3:riff.WEBP.EXIF.offset => 57572
    id3:riff.WEBP.EXIF.size => 62
    id3:riff.WEBP.EXIF.data => Exif??MM?*???????????????this is my comment smiletest??
    id3:riff.encoding => ISO-8859-1
    id3:playtime_seconds => 0
    id3:mla_webp_exif_metadata.FileName => test2.webp
    id3:mla_webp_exif_metadata.FileDateTime => 1676829753
    id3:mla_webp_exif_metadata.FileSize => 57642
    id3:mla_webp_exif_metadata.FileType => 2
    id3:mla_webp_exif_metadata.MimeType => image/webp
    id3:mla_webp_exif_metadata.SectionsFound =>
    id3:mla_webp_exif_metadata.COMPUTED.html => width="550" height="368"
    id3:mla_webp_exif_metadata.COMPUTED.Height => 368
    id3:mla_webp_exif_metadata.COMPUTED.Width => 550
    id3:mla_webp_exif_metadata.COMPUTED.IsColor => 1
    id3:mla_webp_exif_errors => E_WARNING (2) exif_read_data(exif.jpg): Invalid TIFF alignment marker
    id3:width => 550
    id3:height => 368
    id3:post_id => 3851

    And the file itself: https://a.uguu.se/RkKgNxyx.webp

    My only thought is that I made some of these test files using a different method that doesn’t include the prefix, but it appears that every file that Pillow creates has it.

    Plugin Author David Lingren

    (@dglingren)

    Thanks for trying out the Development Version and reporting your results. Thanks as well for posting a link to one of the offending files; very helpful.

    The “prefix” you encountered is a standard part of the “APP1” header. I added it manually to your first example. It is trivial to test for the presence/absence of the prefix and handle both cases. I will do that and post an updated Development Version.
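
As an illustration of handling both cases, here is a minimal Python sketch. It is not MLA’s code, and the function names `strip_exif_prefix` and `tiff_byte_order` are hypothetical. It removes the six-byte `Exif\0\0` prefix when present and then checks that what remains begins with a TIFF header, that is, `MM` or `II` followed by the number 42 in the matching byte order.

```python
import struct

EXIF_PREFIX = b"Exif\x00\x00"


def strip_exif_prefix(payload: bytes) -> bytes:
    """Return the TIFF block of an EXIF payload, with or without the prefix."""
    if payload.startswith(EXIF_PREFIX):
        return payload[len(EXIF_PREFIX):]
    return payload


def tiff_byte_order(tiff: bytes) -> str:
    """Validate the TIFF header and return "big" or "little"."""
    if len(tiff) < 4:
        raise ValueError("TIFF header is too short")
    if tiff[:2] == b"MM":      # Motorola byte order, big-endian
        fmt, order = ">H", "big"
    elif tiff[:2] == b"II":    # Intel byte order, little-endian
        fmt, order = "<H", "little"
    else:
        raise ValueError("missing TIFF byte-order marker")
    if struct.unpack_from(fmt, tiff, 2)[0] != 42:
        raise ValueError("missing TIFF magic number 42")
    return order
```

Calling `tiff_byte_order(strip_exif_prefix(payload))` gives the same answer for prefixed and unprefixed payloads, so the rest of the parsing can treat both kinds of file the same way.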

    Again I encourage you to contact me at my web site to facilitate progress without cluttering this topic. Thanks!

    Plugin Author David Lingren

    (@dglingren)

    I have uploaded a new MLA Development Version dated 20230219 that parses the metadata found in the ID3 array with or without the prefix and places it in the existing exif: array. Let me know if you find any other issues with the new version; thanks.

    Plugin Author David Lingren

    (@dglingren)

    @sdai – I am happy to report that I have worked with a second MLA user who has added EXIF metadata to WebP files (using a current version of Adobe Photoshop) and confirmed that MLA can extract it. Their files also contain XMP metadata, which MLA handled without any further modification.

    I will be releasing a new MLA version shortly after WordPress 6.2 goes live. If you have any problems or further questions regarding this topic please post them soon. Thank you.

    Thread Starter sdai

    (@sdai)

    Thank you for your work, David. I was on a short break, but I have confirmed that it’s working perfectly! Developers like you are what make the community great. I am happy to hear that it works for “real” webp files too!

    Plugin Author David Lingren

    (@dglingren)

    I have released MLA version 3.07, which contains the enhancements required for this topic.

    I am marking this topic resolved, but please update it if you have any problems or further questions regarding MLA’s support for metadata in WebP image files. Thanks for inspiring this MLA improvement.

    Hi David,

    I have installed MLA for the first time today and this is a really great plugin. Thank you very much for this!

    Because I don’t understand the technical details, I am not sure if this thread is the right place for my problem:

    The plugin works fine when I upload JPEG files. I have filled in the fields “Title” and “Caption” in Adobe Lightroom and the information appears in the media library as expected. I think Adobe LR writes this information into IPTC fields.

    But I don’t want to use JPEG anymore because the files are a bit too heavy, so I convert all images to webp before uploading. The IPTC information still exists in the files, but MLA no longer parses it.

    It would be great if you could find a solution for this. Or do I have to configure the plugin in a special way? Possibly there is a checkbox or something that I have not seen?

    Thanks again for this great plugin and best regards
    Danyel

  • The topic ‘Cannot extract EXIF metadata from webp’ is closed to new replies.