In the world of image processing research, it is important to have standard test images so people can compare their results. The cameraman image is a test image that has been used for decades. It can be found in many image processing textbooks and homework problems.
Imaging and photography have come a long way since then. Just for fun, here are a couple of pictures I took during my trip to Cairo, Egypt for ICIP 2009, along with a brief introduction to some research that Professor Sabine Süsstrunk presented on near-infrared imaging to improve digital photography in the years to come.
Professor Sabine Süsstrunk does research at EPFL in her Images and Visual Representations Group on high dynamic range imaging using the visible and near-infrared (NIR) spectra. She takes two pictures of a scene. First, she takes a "normal" picture, which captures the visible spectral components of the scene. Then, she uses special NIR filters on her hacked camera to capture the NIR spectral components of the scene. The NIR spectrum does better on a hazy day because it is scattered less than the visible spectrum is.
A "normal" digital photograph uses color filters to capture the visible color of a scene. NIR does not capture color, but it does capture high-frequency components. So, it takes some fancy image processing on the normal and NIR-captured images to create nice high-dynamic-range images. Take a look at Sabine's web page on NIR imaging to learn more, see actual improved images, and hear it directly from the source.
When Sabine gets back to the lab she will process her pictures of the pyramids using her research techniques. We should be able to compare her pictures of the pyramids to mine. It was a nice hazy day, so we should see some nice improvements!
In the meantime, take a look at the pictures I took of Sabine taking her NIR pictures. Do you think we have a new camerawoman test image?
Remember the good old days when you had a terminal screen and you typed ls, cd, and man? And, if you were a little more advanced, you might have used pushd, popd, cat, head, and tail. Well, there is a very alpha project called tweetsh. Tweetsh is a command-line shell interface for Twitter. It treats Twitter users/tweets as a big directory/file system and lets you access it with basic shell commands. Very cute and clever.
From what I can tell, this project is one guy in Amman, Jordan hacking for four days, so understandably there are still some bugs in it. But I think it's cool in that geeky sort of way, and TechCrunch thought it was noteworthy, too. In the comments of the TechCrunch article you'll find a few other back-to-basics Twitter interfaces, such as a Ubiquity plugin for Firefox and a Twitter wrapper for Emacs.
These are geeky cool, but I think they represent (or at least make me think about) a more significant trend. Let's take a closer look.
Technology, platforms, applications, and services for Social Networking
The computer was built on technology: computing and memory. The web was built on technology: computers, networking, and protocols. Blogging was built on more technology: the web and syndication. All of these were created with the intention of being platforms that other applications and services would be built on.
Social networking primarily has been built as a service and application (built on the web), e.g., MySpace, Facebook, LinkedIn, Twitter. You could call them service platforms, because to varying degrees they are supporting APIs that other apps and services can be built on. I would say that of the four, Twitter takes the most platform-centric view from the UI level in that it made it easy for others to build user interfaces, applications, and services on top of it from the start. (Let's face it, tweetsh is actually a retro UI for Twitter.) Facebook (with Facebook Connect) and MySpace are gradually moving that way too. These are platforms at a service API level.
But now there is a more fundamental movement going on. Social networking and cloud services are being built into the fabric of the Internet. More accurately, social cloud technologies will be built into the fabric of the Internet.
Creating a stable social cloud technology platform for the Internet
Note that these social cloud technologies are not new; it's just that they are coming to life and ready for widespread implementation and adoption. As social cloud technologies get adopted, they will create a stable technology platform that the developer community and industry can build and grow on.
For a few reasons:
- There are enough users of social networking and there is enough evolution of the usage models to create the demand.
- There are enough services and applications, and enough churn and instability in the industry, to create the need for stability and interoperability.
What does this mean?
This has a number of implications:
- It will become easier and more standardized for developers to build social networking cloud applications and services in a leveraged way.
- Social networking applications and services will become more interoperable.
- People will be able to have more control of their digital social networking lives.
Marshall Kirkpatrick asks, "Is a perfect storm forming for distributed social networking?"
My answer is: Yes. And social cloud technology will be at the base of it.
What do you think? Is now the time for social cloud technology to be built into the fabric of the internet? What technologies are key to making the internet a stable social cloud technology platform?
For you tech historians and pundits out there, did I get this right or wrong?
I took a sabbatical from blogging for the last 1.5 years. It was not because I was making a statement. It was not because I switched to a flashier tool. It was just because I took a job that was not very conducive to blogging. In essence, I was immersed in a business VP role and there were too many sensitive issues I would have had to navigate around, so this made it difficult to write posts. Now I'm in a CTO role which I find to be much more conducive to blogging.
Since I had a year and a half away from the blogosphere, I had the opportunity to "see what changed" now that I'm back. I looked for my blogging tools tucked away in various corners of the internet, afraid of what I'd find as I reached into the cobwebs. Fortunately, or should I say unfortunately, I remembered most of my passwords. There they were, remnants of decayed accounts and dismal blog stats everywhere.
Regardless of my own situation, this did give me a chance to notice some significant changes that occurred in the blogging world over the last 1.5 years. Here are a few significant changes that I think are worth noting:
- The players have changed.
- The tools and methods have changed.
- Subscribers are dead. Actually, they're not dead, but they've taken a new form.
- Comments are dead. Actually, they're not dead, they just dispersed.
- Services churn.
Let's look at these a little more closely with a little Q&A.
Where are the old players?
The most important part of jumping back into blogging is reading blogs. I scratched my head to find my blog readers (in those dusty, deep, dark corners). Some blogs had hundreds or thousands of posts that I had to catch up on (or just mark as read). But a number of the blogs were dead or on sabbatical, like mine. Many people were more thoughtful than me: instead of just falling off the face of the blogosphere, they actually said they were taking time off. And these were pretty well-reputed, prolific bloggers who had followers and live conversations going on all the time.
Takeaway: Many of the old players are on sabbatical.
What are the new tools?
So now I had to go out and find new blogs to read. I went through the cobwebs again, but I realized that some older tools, like blog readers, seemed a bit stodgy and linear. The new way that I find and read blogs is by chasing a maze of links embedded in tweets on Twitter. Then, instead of subscribing to the blog's RSS or atom feed, I subscribe to the blog author's tweets.
Who are the new players?
So I twittered my way around the twitterverse, which took me through part of the blogosphere, and found some of the new players. I was pleasantly surprised to find the world of Gen Ys, also known as the 20-somethings, or the 80s (born in the 1980s, as they're called in China). They're great: honest, motivated, ambitious, and trying to make their way in this down economy. And there is a whole industry of career coaching, and yes, they're coaching each other. I'm pretty excited about this group and how they will change the world in their own ways.
Takeaway: The new players are Gen Y's who are about to step into their 30's.
Where are my subscribers?
Since I was away from blogging and because my blog address had changed (don't ask), I lost almost all of my subscribers. So, when I wrote a new post, I had to find a new way to let people know about it. What did I do? I used my swanky social networking tools to advertise it: I posted a link on Facebook and I tweeted a link on Twitter. While I don't have much of a Twitter following, my Facebook network is pretty rich. So, I did get a modest number of readers for my post.
Takeaway: Blog subscribers are gone. They are followers or friends instead.
Where have my comments gone?
As soon as I posted, I received a number of comments, but few appeared on my blog. Instead, the comments appeared on my Facebook page as Facebook comments. Also, one of my posts was forwarded, but not by a blog trackback, instead it was forwarded by a Twitter retweet. So, in essence, blog comments and trackbacks still exist, but they exist through other services such as Facebook and Twitter.
Takeaway: Blog comments have dispersed. They appear on Facebook, Twitter, and FriendFeed(++) instead.
Which service should I use? Who can I trust?
The other thing that is happening is that there are so many services and so much churn that the community doesn't know who to trust. Facebook bought FriendFeed. Tr.im closed down, well, almost. People don't know where to turn. They don't know where to invest and store their digital life. This will guarantee that the level of change that occurred in my 1.5 year sabbatical will happen even quicker in the years ahead.
Takeaway: Services churn! People who want to keep their digital lives (and their digital friends) will be responsible for carrying it forward themselves.
Is there any room for Old Timers?
Yes, there is a lot of room for the Old Timers. As the Gen Y's enter their 30's, they will have to interact with the 40-somethings, 50-somethings, and 60-somethings. They will look to the old-timers that carry wisdom and have maintained relevance to help them navigate in this new world. In many ways, the new world will be new. But in other ways, it will be back to the basics.
Takeaway: The wise and relevant old-timers will guide the young.
So that's my view on how blogging has changed in the last 1.5 years. Please wish me luck in finding my followers, friends, and comments.
What do you think? How has blogging changed in the last 1.5 years?
ACM is the premier professional research society for computer scientists. I think it is quite a statement that the broader research community is recognizing design, experience, and human emotion as bona fide research topics. Cuteness is being recognized as research by the research community!
In my mind, user adoption is the ultimate indicator of a technology's success, and adoption is driven by having a great user experience. The research discussed in workshops like these will help us understand and eventually formalize the coupling of experience and technology. Understanding how to provoke human emotions like cuteness will help identify new research directions and drive technology adoption.
Congratulations to the researchers who were pushing these ideas in their work before it reached broader acceptance! For example, 2007 was the 25th anniversary of CHI. Clearly your efforts are paying off!
What do you think? Is studying "cuteness" research?
Feel free to leave a URL with your comments.
Mobile & media experiences connect people with each other, with information, and with their environment. Media is increasingly being delivered in packets over networks. This raises a number of questions for today's networks:
- How can we transport media packets?
- How can we adapt media packets for diverse clients?
- How can we protect media packets?
- What impact do globally distributed, immersive media environments have on media packet delivery systems?
- What role does context play in next-generation mobile media experiences?
Coupling experience and technology
I began by stressing the importance of coupling experience and technology. Rather than developing technology in a box, it is important to first consider the desired user experience and then develop the technologies that impact it. The most important factor for deciding whether a technology gets transferred to product is not how good the technology is, but rather how it impacts the user experience. I have been passionate about this theme for quite some time, and as time passes my passion for this only grows stronger.
The rest of my talk cycled between the following experiences and technologies.
Mobile & Media Experiences
- Experience #1: Mobile, Diverse, Interactive: Diverse mobile video clients, desktop video, living room video
- Experience #2: Immersive, Conversational, Worldwide: Halo collaboration experience, Panoply immersive gaming experience
- Experience #3: Pervasive, Personalized, Context-aware: Mediascapes context-aware multimedia experience
Technologies
- Packet labeling & metadata
- Transcoding & Processing in the network
- Scalable Streaming
- Secure Scalable Streaming
- Multiple Distortion Measures
- Public & private domains
- Sensing context in the network
Experience #1: Mobile, Diverse, Interactive
Packet labeling & metadata: The main point is that we live in a distributed networked world where media packets will traverse distributed network elements with multiple owners and administrative domains and be processed by devices and equipment made by different manufacturers. In this highly distributed world, one important thing that we can do is smartly label our packets in hopes that over time the smart network elements along the way will use these labels to improve the overall quality of the user experience. The key design principle is to design packet labels that are 1) specific enough to be useful and 2) general enough to be understood.
Example packet labels and metadata include:
- Importance: Distortion values
- Time requirements: Time stamps
- Content type: Video, audio, text, data
- Scalability: Is it truncatable?
- Media attributes: spatial region, resolution, color; audio channel
- Dropability: Can it be dropped? e.g., Drop video for audio-only session.
- Processibility: Is it transcodable? Can it be processed?
- Security: What are the rights and privacy implications of the media?
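To make the list above concrete, here is a minimal sketch of what such a packet label might look like as a data structure. The field names and types are purely illustrative assumptions of mine, not from any standard:

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical media-packet label carrying the kinds of metadata listed
# above. Names and types are illustrative only.
@dataclass
class PacketLabel:
    distortion: float                   # importance: distortion reduction if delivered
    timestamp_ms: int                   # time requirements: presentation time stamp
    content_type: str                   # "video", "audio", "text", or "data"
    truncation_points: List[int] = field(default_factory=list)  # scalability
    resolution: Optional[str] = None    # media attributes, e.g. "640x480"
    droppable: bool = False             # can the packet be dropped entirely?
    transcodable: bool = True           # can mid-network elements process it?
    drm_protected: bool = False         # rights/privacy implications

label = PacketLabel(distortion=12.5, timestamp_ms=40,
                    content_type="video",
                    truncation_points=[200, 800, 1500],
                    resolution="640x480")
```

The design principle from the talk applies directly here: the fields are specific enough for a network element to act on, yet generic enough that they are not tied to any one codec.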
Transcoding & Processing in the network
I discussed the experience of delivering media to and from users over any network and on any device. This motivates the technology of performing transcoding operations in the network. In 3G networks, the streaming, recording, and transcoding capabilities can be performed by the IMS Multimedia Resource Function (MRF), which serves and receives the media packets to and from the handsets. Dynamic transcoding can be used to adapt the video for the target client device (e.g., to lower the resolution) and for the network (e.g., to seamlessly handoff media between 3G and 2.5G networks during a mobile media session).
The research challenge that lies ahead is designing and developing transcoding algorithms in a manner that is computationally efficient so that a single transcoding node (e.g., IMS MRF) can process many streams at once to serve multiple clients at one time.
This brings us to a technology called scalable streaming that makes transcoding much more efficient by leveraging scalable coding methods. In essence, if scalable coding methods are used, then we can form scalable packets that pack scalable data, for example low, medium, and high resolution data, into the packet in a manner that allows it to be transcoded by simply truncating the packet. Furthermore, the scalable media packets can have packet labels that contain image metadata and truncation points that can be used by a scalable packet transcoder. The scalable packet transcoder is quite simple: it reads the packet label and then truncates the packet as needed.
Research opportunities arise if the packet labels contain the distortion value of the particular media packet. If distortion values are included in the label, then they can be used as hints for rate-distortion optimized streaming algorithms and rate-distortion optimized transcoding algorithms to improve the quality of the user experience.
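A toy sketch of the truncation idea, using a made-up packet layout (layers packed back to back, with their end offsets recorded in the label), shows how little work the transcoder actually does:

```python
# Toy scalable packet transcoder (assumed layout, not a real codec):
# low/medium/high-resolution layers sit back to back in the packet, and
# the label records the byte offset where each layer ends. Transcoding
# to a lower resolution is then just packet truncation.

def transcode(packet: bytes, truncation_points: list, target_layer: int) -> bytes:
    """Keep layers 0..target_layer by cutting at the recorded offset."""
    cut = truncation_points[target_layer]
    return packet[:cut]

# Illustrative layer sizes: low = 4 bytes, medium adds 6, high adds 8.
packet = b"LLLL" + b"MMMMMM" + b"HHHHHHHH"
points = [4, 10, 18]   # end offsets of the low, medium, and high layers

low_only = transcode(packet, points, 0)   # just the low-res layer
low_med  = transcode(packet, points, 1)   # low + medium layers
```

No parsing or re-encoding of the media itself is needed, which is what makes a single node able to serve many streams at once.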
Secure Scalable Streaming
Another desired experience includes serving diverse clients while having end-to-end security. End-to-end security means that the media is protected in a manner that allows only the sender and authorized receivers to access the media, while the network delivers, stores, and transcodes the media packets without ever decrypting them. It turns out that this can be achieved by using the same method as scalable streaming, where scalable packets are formed by leveraging scalable coding, and then coupling the packet formation with the encryption process. Specifically, encryption is applied to the packet in a manner that allows the packet transcoding operation to still occur by simple packet truncation. This can also leverage secure scalable image coding standards such as the newly created JPSEC standard for security of JPEG-2000 imagery.
Secure Scalable Streaming was published in ICASSP 2001 by Susie Wee and John Apostolopoulos.
Multiple Distortion Measures
I then described a new technology area that we are studying called Multiple Distortion Measures (MDM). This begins with the following observation: consider a set of scalable media packets. Generally speaking, the best ordering of the packets is determined by the profit-to-size ratio (in tech terms, the distortion-reduction-to-size ratio, delta d over delta r). Surprisingly, we observed that the best ordering for low-resolution display is NOT equal to the best ordering for high-resolution display. The question that arises is: how different are they?
I showed a graph from our ICASSP 2007 paper that shows the PSNR vs. rate plot for the low-resolution reconstructed image with packets ordered in the low-res-optimal order and with packets in the high-res-optimal order. It turns out that there are differences in performance of up to 4 dB. The graph also showed the PSNR vs. rate plot for the high-resolution reconstructed image with packets ordered in the high-res-optimal order and in the low-res-optimal order. It turns out that these can have differences of over 1 dB.
This raised a lot of interest from the crowd. I think we'll have lots of people researching MDMs in the years ahead.
This raises the idea of labeling scalable media packets with multiple distortion measures, specifically, with the distortion value of the packet with respect to the low resolution image, the medium resolution image, and the high resolution image. If the packet contains this information, then streaming algorithms can be developed to optimize the media delivery experience to users with diverse client devices.
Multiple Distortion Measures was published in ICASSP 2007 by Carri Chan, Susie Wee, and John Apostolopoulos.
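The core MDM observation can be illustrated with a small made-up example (the numbers below are mine, not from the paper): rank the same packets by distortion-reduction-per-bit measured against two different display resolutions, and the orderings diverge.

```python
# Toy illustration of Multiple Distortion Measures: the greedy
# profit-to-size ordering of scalable packets depends on which display
# resolution the distortion is measured against. Numbers are invented.

packets = {
    # name: (size_bytes, delta_d_low, delta_d_high)
    "base":       (100, 50.0, 50.0),  # helps both displays
    "detail":     (100,  1.0, 40.0),  # fine detail, invisible at low res
    "refinement": (100, 20.0, 10.0),  # mostly improves the low-res image
}

def order_by_ratio(pkts, which):
    """Sort packet names by delta_d / delta_r, best first."""
    idx = 1 if which == "low" else 2
    return sorted(pkts, key=lambda p: pkts[p][idx] / pkts[p][0], reverse=True)

low_order = order_by_ratio(packets, "low")    # best order for low-res display
high_order = order_by_ratio(packets, "high")  # best order for high-res display
```

A streaming server that only knows one distortion measure must pick one of these orders and accept a penalty on the other display, which is exactly the gap the PSNR-vs-rate plots quantify.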
The last part of the keynote focused on experiences #2 and #3 to look at the impact of emerging applications on future packet networks.
Experience #2: Immersive, Conversational, and Worldwide
Delivering immersive, high-quality, worldwide experiences has a number of challenges for today's networks. The main problem is that network intelligence exists, but only in spots. For example:
- QoS exists in spots, but is not guaranteed from beginning to end.
- IPv6 exists in spots, but it is often tunneled over IPv4 and so is not available from beginning to end.
- Significant congestion can occur in peering points between administrative domains, and it is very common for packets to traverse administrative domains many times in a single session.
- Due to the scarcity of IP addresses, packets in countries such as India may go through many network address translations (NATs) before being delivered to the recipient.
Public & private domains
As a result, proprietary networks are being built to deliver guaranteed experiences. HP's Halo immersive collaboration experience is built on a proprietary network for that very reason.
In the long run, the right answer is to build out networks that contain IPv6 and QoS. However, until that occurs, there is likely to be a co-existence of public and proprietary networks.
This raises research opportunities of developing protocols and algorithms that improve media delivery over co-existing public and proprietary networks. This also motivates the need to develop packet labels that contain information that can be used by smarter network elements that understand them. And, this once again raises the design principle of designing the labels so that they are specific enough to be useful but general enough to be widely understood.
Experience #3: Pervasive, Personalized, Context-aware
Finally, I described Mediascapes as an example of pervasive, context-aware multimedia experiences. The main essence of Mediascapes is that it uses sensors to trigger multimedia experiences tied to your physical and personal context.
Sensing context in the network
This raises the question of using sensors to sense your context and getting the sensed context into packets that can be used by different applications and services. In the web world, the sensors may exist as GPS sensors, environmental sensors, or personal sensors. In the operator world the sensors may come through carrier-grade network elements as in IP Multimedia Subsystem (IMS) architectures. For example, IMS context can include location, presence, group lists, and subscriber info.
The key is to have the sensors provide context that is wrapped into packets in a manner that can be easily used by applications and services. This raises the challenge of creating a semantic representation for sensed context. Again, like the packet labels, this must be designed to be both specific enough to be useful and general enough to be widely understood.
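As a thought experiment, here is one possible shape for such a sensed-context record. The schema is entirely hypothetical; the point is a representation with typed fields and units (specific enough to be useful) expressed in a common serialization like JSON (general enough to be widely understood):

```python
import json

# Hypothetical sensed-context record that could ride alongside media
# packets. Field names, values, and structure are illustrative only.
context = {
    "source": "handset",             # or a carrier-grade IMS network element
    "location": {"lat": 30.0444, "lon": 31.2357, "accuracy_m": 25},
    "presence": "available",
    "groups": ["family", "icip-2009"],
    "environment": {"temperature_c": 31.5, "light": "outdoor"},
}

wire = json.dumps(context)           # serialize for the packet payload
restored = json.loads(wire)          # any application can parse it back
```

An IMS element could populate the presence, group-list, and subscriber fields, while a web-world device fills in GPS and environmental sensors, and both sides still speak the same representation.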
I'd like to take a moment to give special thanks to John Apostolopoulos, Carri Chan, Steve Froelich, Dave Penkler, Qibin Sun, and Zhishou Zhang for their contributions to various parts of this work!
Final note and questions
The audience was great and the talk seemed to generate lots of discussion throughout the workshop.
This was a fun topic to put together for the keynote and I'd like to develop it further. I'd love to hear your thoughts and ideas on any aspects of this.
What are your thoughts and comments on the life of a packet?
Did you attend the workshop and keynote? If so, what did you think?
I'd like to develop this further. Do you have any suggestions for improvements?
Please feel free to leave a URL with your comments.