Just want to make sure I understand. Even if the payload is removed from the image, it should be protected with a password. Assuming the contents are secure, there should be no reason to remove the payload. Is that right?
Assuming the contents are secure, there should be no reason to remove the payload.
Right, I personally don’t see a reason why someone would want to strip a data carrier of its embedded content. I’m only postulating that it could happen.
The data carrier is publicly accessible anyways; anyone can copy those bits – directly or indirectly – and use them as they please (“Save As”).
This applies to all tokenized media except encrypted content (i.e. media whose public-facing bits are already encrypted).
I’ve been thinking about the technical details of the proposed NFT standard and I have some questions about the implementation, mostly with respect to limitations and design decisions. I apologize if my ramblings are incoherent.
Perhaps I’m boiling the ocean with my approach; ultimately, I’m trying to achieve a use case for steganography that is more than just single-use consumables.
Maybe I should just focus on the simplest implementation of steganography (LSB) and call it a day.
Some things to consider:
- What if we want to temporarily or permanently reveal some of that embedded information to the world, or to certain people?
  - Context: medical records stored in the NFT. I want my doctor to have limited access to this information.
  - Alt context: OP’s post and the Multi-Image NFT.
  - Considerations:
    - multisig for NFT platforms
    - Certificate Authorities to revoke permissions after a time period has expired
- Do we want to embed an encrypted payload in every token that’s minted, even if the payload is empty?
  - Context: anti-steganalysis mechanism where even if a data carrier is correctly identified to be a Zenon NFT, an adversary wouldn’t know if the embedded data is useful or not.
  - Counterpoint: If an adversary suspects there is embedded data in a carrier, steganography has failed its purpose.
- Perhaps we will have two types of NFTs: immutable and mutable (as suggested by George)
  - Note: I am hard-pressed to identify a reason to combine steganography with an immutable NFT beyond one-time consumables.
- If we do implement steganography in every token issued on NoM, then token transferability is limited to applications that can transcode the asset. Mobile and browser wallets will likely not be able to perform this task, or they may be limited to certain file types due to processing limitations.
- The amount of data that can be embedded is largely bounded by the size of the container and the encoding method. I found an example of lossless steganography but it has some downsides that I think are unsuitable for our project (listed in section IV.)
- Steganography tends to be lossy; the original bits are being manipulated and this can introduce visual artifacts that are perceivable to the human eye.
  - As more of the original data is replaced, embedded capacity increases, but so does “noise” in the carrier data.
- My pseudo-steganography PoC enables virtually limitless container sizes, introduces no noise in the carrier data, but the presence of embedded data is trivial to detect.
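For anyone who wants to see the "simplest implementation" concretely, here is a minimal LSB sketch. It works on raw sample bytes (for example, decoded pixel values), not on a compressed file such as a PNG, and the 4-byte length prefix and the function names `lsb_embed` / `lsb_extract` are assumptions made for illustration only. It shows both points from the list above: every payload bit costs one carrier byte, and every touched byte changes by at most 1.

```python
def lsb_embed(carrier: bytes, payload: bytes) -> bytes:
    """Hide payload in the least significant bit of each carrier byte."""
    # Assumption: a 32-bit big-endian length prefix tells the extractor where to stop.
    data = len(payload).to_bytes(4, "big") + payload
    # Flatten the data into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("payload too large: one carrier byte is needed per payload bit")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(out)


def lsb_extract(stego: bytes) -> bytes:
    """Recover a payload written by lsb_embed."""
    def read_bytes(start_bit: int, count: int) -> bytes:
        result = bytearray()
        for b in range(count):
            value = 0
            for i in range(8):
                value = (value << 1) | (stego[start_bit + b * 8 + i] & 1)
            result.append(value)
        return bytes(result)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)


carrier = bytes(range(256)) * 2            # 512 stand-in "pixel" bytes
stego = lsb_embed(carrier, b"secret")
print(lsb_extract(stego))                  # round-trips the payload
print(max(abs(a - b) for a, b in zip(carrier, stego)))  # largest per-byte change
```

Note that nothing here is encrypted; anyone who knows the scheme can run `lsb_extract`, which is the open-source concern raised elsewhere in this thread.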
Here’s some literature that I found interesting, particularly from page 16 onwards: Magic LSB Substitution Method (M-LSB-SM)
ZK and homomorphic encryption might be relevant for the privacy preserving medical records use case. Here’s a neat implementation on eth to make data types private:
Seems to me that multi sig built in could have a big impact on use cases for tokenizing real world assets or just shared digital ones.
I don’t know about the multi-image idea, though. Other than digital collectables or consumables, I’m not sure about practical use cases; it seems more creative than practical. What can you do with a multi-image NFT that you can’t do with multiple separate images and a smart contract?
What I keep coming back to is the Spectral Attestation idea and how useful it would be to easily assert your ownership of an NFT on-chain, though I understood this was mainly done through the metadata.
If this function needs steganography, then every token minted should have the capacity to hold an encrypted payload, but that also makes the practice public, so it doesn’t make as much sense.
If the assertion can be made outside of any steganography tools, then more effort could be put into hiding the embedded data, which the Magic LSB Method sounded pretty good at.
I revisited the NFT Standard article and determined that my initial idea of pseudo-steganography is likely the only viable way to achieve Mr Kaine’s vision. I don’t see any other way, based on research I’ve done about modern steganography capabilities.
I’ll explain why in this post.
TL;DR
I propose that we concatenate an encrypted, variable size data container to carrier data. We may potentially use LSB to embed the Virtual Hologram in the carrier data.
Excerpt from the Medium Article:
When you create an NFT, you should have the option to add additional content that can only be revealed by the NFT holder. Such content can be anything.
High definition content, messages, video content, access to secret communities, discount codes, or even smart contracts, treasures, and easter eggs are all potential candidates for unlockable content.
Mr Kaine knows people will be minting JPEGs, but stipulates that the solution should permit any content to be embedded in the carrier data.
I only know of one way this can be achieved.
Since the embedded data can be of variable size and can potentially be greater than the size of the original carrier data, we must find a steganographic technique that:
- is extremely flexible,
  - in terms of file size, file type
- is cryptographically malleable
  - in terms of asset transferability
- is not constrained by the size of the input data
- can be parsed by anyone to certify authenticity (Spectral Attestation)
- ideally, is accessible to mobile wallets and devices with less compute capabilities
A core limitation of modern steganographic techniques is the size of the carrier.
Generally, steganography introduces changes in the original file’s data in order to hide other data; replacing original data with anything else tends to lead to data loss.
You can’t replace bits and expect all the information to remain intact.
Of course, slight changes to the file may be imperceptible to humans, but there’s a relation between how much data can be stuffed into the carrier bits and the visual/auditory/spatial degradation of the carrier data’s quality.
Low impact to original bits
- more likely to maintain original file quality
- less capacity for the embedded file
Higher impact to original bits
- less likely to maintain original file quality
- more capacity for the embedded file
This is why steganography is generally lossy. Same applies to compression.
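To put rough numbers on that tradeoff, here is a back-of-the-envelope capacity calculation. It assumes an uncompressed 900 x 900 RGBA carrier (the dimensions binwalk reports for the test image in this thread) and plain LSB-style replacement of the lowest `bits_per_channel` bits of every channel; `lsb_capacity_bytes` is a hypothetical helper, not part of any PoC.

```python
def lsb_capacity_bytes(width: int, height: int, channels: int, bits_per_channel: int) -> int:
    """Payload bytes that fit when the lowest bits of every sample are replaced."""
    return width * height * channels * bits_per_channel // 8


SECRET_MP4_BYTES = 17230 * 1024  # size of secret.mp4 from the tests in this thread

for k in (1, 2, 4, 8):
    cap = lsb_capacity_bytes(900, 900, 4, k)
    fits = cap >= SECRET_MP4_BYTES
    print(f"{k} bit(s) per channel: {cap} bytes, fits secret.mp4: {fits}")
```

Even at 8 bits per channel, which replaces the carrier image entirely, the capacity is 3,240,000 bytes, far short of the video. That is the size constraint in a nutshell.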
About lossless steganography...
There are few lossless solutions and they do not fit our needs. They rely on the chance that the carrier data contains the same subset of bits as the embedded data.
I’m not confident this will work for every file and users cannot be limited by their selection of files.
If you find any lossless solutions that could work in our context, please let me know.
Mr Kaine outlines two options in his article: LSB and BPCS.
LSB is primitive, popular, easy to implement for many file types, but it has the embedded data size constraint.
BPCS is a newer technique that depends on spatial noise in the carrier data to hide information, and it seems to be only applicable to image files.
It works well with image data of a realistic setting with lots of different shades, colors, edges, objects, etc, but fares poorly with flat, cartoon-like graphics.
It takes advantage of visual chaos; cartoons tend to have large areas of the same pixel color.
For example, BPCS would offer much higher capacity for the image on the left.
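The "visual chaos" idea can be made concrete with the border-complexity measure that BPCS applies to small blocks of a bit plane. The sketch below is a simplified illustration and not a full BPCS implementation: it counts how many neighbouring pixel pairs in a binary block differ and divides by the maximum possible. The function name and the 8 x 8 block size are assumptions for the example.

```python
def block_complexity(block: list[list[int]]) -> float:
    """Fraction of adjacent pixel pairs (horizontal and vertical) that differ."""
    rows, cols = len(block), len(block[0])
    changes = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and block[r][c] != block[r][c + 1]:
                changes += 1  # horizontal border between different pixels
            if r + 1 < rows and block[r][c] != block[r + 1][c]:
                changes += 1  # vertical border between different pixels
    max_changes = rows * (cols - 1) + cols * (rows - 1)
    return changes / max_changes


flat = [[0] * 8 for _ in range(8)]                              # cartoon-like flat area
checker = [[(r + c) % 2 for c in range(8)] for r in range(8)]   # maximally "noisy" block

print(block_complexity(flat))     # 0.0
print(block_complexity(checker))  # 1.0
```

BPCS only overwrites blocks whose complexity is above some threshold, so an image made mostly of flat blocks like `flat` offers very few places to hide data.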
The final reason why those two methods won’t work: we need to encrypt the embedded contents, because our software is open source and the ledger is public.
A malicious actor will scrape all the carrier data by querying the ledger and extract the embedded payloads of each NFT.
I’ve tested AES-256-CBC and AES-256-GCM in Dart; we should expect ~4x file size output or greater. I could try other encryption schemes, as well.
Some results from a few tests:
Input files
- input.png = 180 KB
- secret.png = 120 KB
- secret.mp4 = 17230 KB
Unencrypted Output
- input.png + secret.png => output.png = 299 KB
- input.png + secret.mp4 => output.png = 17410 KB
AES-256-CBC Output
- input.png + encrypted(secret.png) => output.png = 907 KB
- input.png + encrypted(secret.mp4) => output.png = 105174 KB
AES-256-GCM Output
- input.png + encrypted(secret.png) => output.png = 908 KB
- input.png + encrypted(secret.mp4) => output.png = 78925 KB
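As a sanity check on these numbers, the sketch below compares the growth observed above with what raw AES output would add. The formulas assume standard parameters that are not confirmed for the PoC: CBC with PKCS#7 padding and a stored 16-byte IV, and GCM with a 12-byte nonce and a 16-byte tag. Under those assumptions the cipher itself adds only a few dozen bytes, so the observed growth may come from how the ciphertext is serialized before it is appended; that is a guess, not something established in this thread.

```python
def cbc_size(plaintext_len: int) -> int:
    """IV (16 bytes) + PKCS#7-padded ciphertext (always rounds up to the next block)."""
    padded = (plaintext_len // 16 + 1) * 16
    return 16 + padded


def gcm_size(plaintext_len: int) -> int:
    """Nonce (12 bytes) + ciphertext (same length as plaintext) + tag (16 bytes)."""
    return 12 + plaintext_len + 16


def observed_growth(output_kb: float, carrier_kb: float, secret_kb: float) -> float:
    """How many times larger the appended data is than the original secret."""
    return (output_kb - carrier_kb) / secret_kb


secret_png = 120 * 1024
print(cbc_size(secret_png) - secret_png)  # bytes added by CBC under these assumptions
print(gcm_size(secret_png) - secret_png)  # bytes added by GCM under these assumptions

# Sizes in KB, taken from the results above.
for label, out_kb, secret_kb in [
    ("CBC png", 907, 120), ("CBC mp4", 105174, 17230),
    ("GCM png", 908, 120), ("GCM mp4", 78925, 17230),
]:
    print(label, round(observed_growth(out_kb, 180, secret_kb), 2))
```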
Here’s the binwalk output; it’s the same for every output file:
$ binwalk output.png

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             PNG image, 900 x 900, 8-bit/color RGBA, non-interlaced
41            0x29            Zlib compressed data, compressed
Visual representation of my proposal:
Note: The Virtual Hologram (VH) is likely to be a small value, like the container’s header (H) and trailer (T).
Alternatively, since the Virtual Hologram may be short and a fixed length (tbd), we could potentially leverage LSB to embed that in the carrier data for all file types.
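A minimal sketch of the concatenation idea, to make the moving parts concrete. Everything specific here is hypothetical: the `HEADER` / `TRAILER` byte values, the footer that stores the two lengths, and the function names. The payload is treated as opaque bytes that were already encrypted elsewhere. Lengths are stored at the end so a parser can walk backwards from the trailer instead of searching the carrier for a magic value.

```python
import struct

HEADER = b"EWH1"                # hypothetical container header (H)
TRAILER = b"EWT1"               # hypothetical container trailer (T)
FOOTER = struct.Struct(">HQ")   # assumed layout: hologram length, ciphertext length


def attach(carrier: bytes, hologram: bytes, ciphertext: bytes) -> bytes:
    """Append H + VH + encrypted payload + lengths + T to unmodified carrier bytes."""
    footer = FOOTER.pack(len(hologram), len(ciphertext))
    return carrier + HEADER + hologram + ciphertext + footer + TRAILER


def detach(blob: bytes) -> tuple[bytes, bytes, bytes]:
    """Split a blob back into (carrier, hologram, ciphertext)."""
    if not blob.endswith(TRAILER):
        raise ValueError("no container trailer found")
    footer_end = len(blob) - len(TRAILER)
    footer_start = footer_end - FOOTER.size
    if footer_start < 0:
        raise ValueError("blob too short to hold a container")
    holo_len, ct_len = FOOTER.unpack(blob[footer_start:footer_end])
    start = footer_start - ct_len - holo_len - len(HEADER)
    if start < 0 or blob[start:start + len(HEADER)] != HEADER:
        raise ValueError("container header missing or lengths corrupt")
    holo_start = start + len(HEADER)
    hologram = blob[holo_start:holo_start + holo_len]
    ciphertext = blob[holo_start + holo_len:footer_start]
    return blob[:start], hologram, ciphertext


carrier = b"\x89PNG\r\n\x1a\n" + b"...image data..."   # stand-in for a real PNG
output = attach(carrier, b"VH", b"\x00" * 32)
print(detach(output) == (carrier, b"VH", b"\x00" * 32))
```

Because the carrier bytes are never modified, there is no quality loss and no capacity limit, but as noted earlier in the thread, the container's presence is trivial to detect.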
I’ve updated my pseudo-steganography PoC but haven’t published it yet.
There are still some cryptography-related logistics I need to work out.
I welcome any feedback regarding my posts on this topic. I’m seeking criticism of the approach, in case there are flaws I haven’t considered.
If I don’t hear from anyone, I might just “do it myself.”
I have thoughts on this, but need time to sit down and respond. Agree with most of your points
Are there any examples of this solution used in production? I’ll google around, but this was the first thing that came to mind. Seems like a viable option w/out doing any research myself.
Interesting links I found. I think I understand this. It’s like TrueCrypt. It’s a simple encrypted payload of variable size that is added to anything and does not impact the original thing. It’s like an encrypted wart that you cannot see and can hide stuff in. Is that about correct?
Why does it 4x the original size? Isn’t the bloat a function of what you are adding to the encryption package? For example: If I add the string “deeZNNutz” to a 100kb image will it 4x the image size?
Same for me. Very interesting ideas.
Yes, it’s like a TrueCrypt container.
A quick test on my end shows that using a small text file as the secret input will not greatly impact the output file size.
Example:
input.png = 180 KB
secret.txt = 1 KB
unencrypted and encrypted output were both 180 KB
This is a pretty good idea. Can the contents of the container be changed? Not sure why we would need to do that. But I assume a public key would be added to the container, and that, along with the private key, will confirm ownership of the digital asset. If the contents of the container are tampered with, the private key will not authenticate ownership.
And if someone loses the image it’s like losing a private key? What is to prevent someone from removing the container and attaching it to another image? I assume the key generation is a function of the hash of the object the container is attached to?
This is cool.
I’m not aware of any commercial software that offers this steganographic solution, but maybe someone has already tried this?
Yes, it’s dynamic and can be updated by the token holder.
As for why we want this – let’s aim for multi-use NFTs as opposed to single-use.
I’m still thinking of how this will impact the NFT record on-chain.
Yes, but hopefully the carrier is uploaded to a reputable storage service.
Nothing. Some actors may try to extract and brute-force these containers. This is the tradeoff for having open source code, a public ledger, and a project requirement to embed any amount of data.
I’m still working on this part, but ideas for key management are welcome.
Awesome. I can dig around for key generation articles too.
But to be clear, the goal is to make sure an encrypted container is attached to a specific “thing A”. And if we remove the encrypted container and attach it to “thing B” it will not be authentic.
Correct?
Spectral Attestation could check a hash that is part of the NFT’s record on-chain.
hash(output with embedded secret)
If someone manages to append the secret container to another file, good for them. They still need the private keys and the ledger won’t validate the imposter file is authentic.
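The check described above could look something like this sketch. SHA-256 is an assumption (the thread does not say which hash the ledger would use), and `file_digest` / `attest` are hypothetical names; the on-chain record is simulated with a plain string.

```python
import hashlib
import hmac


def file_digest(data: bytes) -> str:
    """Digest of the complete output file (carrier plus appended container)."""
    return hashlib.sha256(data).hexdigest()  # assumption: SHA-256


def attest(candidate: bytes, onchain_digest: str) -> bool:
    """True only if the candidate file matches the digest stored in the NFT record."""
    return hmac.compare_digest(file_digest(candidate), onchain_digest)


container = b"<encrypted container bytes>"
original = b"thing A" + container
recorded = file_digest(original)           # what would be stored on-chain at mint time

imposter = b"thing B" + container          # same container moved to another carrier
print(attest(original, recorded))          # True
print(attest(imposter, recorded))          # False
```

One consequence worth keeping in mind: because the digest covers the whole file, any update to the container by the token holder changes the digest, so the on-chain record would presumably need to be updated as well.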
I like this idea a lot. Encrypted Warts.
EW standard.
Here are some research links to read.
- https://www.coindesk.com/sponsored-content/encrypted-variable-tokens-the-next-generation-of-innovative-media-assets/
- https://www.newtonproject.org/en/evt/
- The Future of NFT Is EVT, the New Game Changer Token – Press release Bitcoin News
- “NFTs are static, while EVTs are dynamic. EVTs allow certain aspects of the metadata to be re-programmed. Ultimately, EVT functionalities solve the residual royalty problem for creators. With EVTs, a creator can continuously enjoy a percentage of royalties as the content/metadata continues to be traded. NFTs weren’t designed this way because of security issues surrounding the coded language.”
- (Not so relevant) https://www.researchgate.net/publication/356339205_Understanding_Security_Issues_in_the_NFT_Ecosystem
- How to Make NFTs Secure? - Merehead (Not applicable)
- https://apptainer.org/ (Encrypted docker containers. Just FYI)
- Wonder if we should submit a ZIP (ERC-721: Non-Fungible Token Standard)
Not much out there on embedded encrypted data in NFTs.
Interesting discussion about Sign in with Ethereum where they discuss data vaults to store information.
These guys are working on the identity standard
Wonder if there is anything we can learn about their data vault.
I checked your PoC and tried to familiarize myself a little with these concepts.
I understood you’re proposing to just add an encrypted payload to the file and the solution does not necessarily need to use LSB or BPCS. Would the benefit of using LSB or BPCS be that it would be harder for an outsider to determine the file has something hidden in it?
Why would encrypting the embedded contents make those methods not work?