This isn't a question about structures built out of cells. If a single cell can easily be light-sensitive, then of course a multicellular creature can be light-sensitive.
A multicellular creature can easily evolve so that only some of its cells perform a particular function. All this requires is a "switch", a chemical that is present in only part of an embryo's body. Some function (such as light sensitivity) may become dependent on the switch. This dependency is to the creature's advantage, because it is wasteful if too many cells are turning food into light-sensitive chemicals.
So, the really basic question is about a single cell. We know why it would find vision useful. We know what chemicals are needed by eye spots in creatures alive today. Are they strange and hard to come by?
The answer is no. Eye spots are thought to have been around for half a billion to a billion years. (The evidence from conserved homeobox genes says that the common ancestor of mammals and insects had a light detector.) But photosynthesis is far older. We know this from algae fossils, and from the inference that plants and bacteria inherited photosynthesis from a common source. Besides, early life would have run out of organic molecules if it hadn't learned to make its own with photosynthesis.
So, light-sensitive chemistry was around. We can also assume that something like Vitamin A was around, since it is found in so many creatures. And the light-sensitive chemical humans use for "night vision" (rhodopsin) is built around retinal, which is just a variation on Vitamin A. There is nothing about this chemical that makes it hard for cells to make, and it resembles chemicals that have other uses. For example, one close relative of rhodopsin is melanopsin, which is used by chameleons in their light-sensitive pigment cells.
Although black-and-white vision would have come first, the chemistry of color vision isn't very complicated either.