A lot of the Spine runtimes are semi-engine-agnostic: most of their code doesn't depend on any particular engine.
This includes Spine-C#, whose classes are mostly written in plain C# without using any external library's classes. So it works with XNA/MonoGame, Unity, and potentially other engines that run on .NET/Mono.
The Spine-Unity runtime is really more of a bridge between UnityEngine functionality and Spine-C# functionality, and mostly just handles engine-specific rendering (the use of components and the UnityEngine.MeshRenderer).
It doesn't use UnityEngine.AnimationClip objects because it uses Spine.Animation objects instead. This not only makes the Spine runtime's code easier to maintain but also gives the runtime more control over how things are rendered, allowing features like free-form deformation, skinning, keyable draw order, etc. to work.
The animations are stored in the skeleton's JSON and deserialized at runtime into Spine.Animation objects, accessible through each SkeletonComponent/SkeletonAnimation's SkeletonData. Use SkeletonData's FindAnimation method to get references to them at runtime/load time.
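As a rough illustration of that lookup, here is a minimal sketch of a Unity component that fetches a Spine.Animation by name and plays it. The class name AnimationLookupExample and the animation name "walk" are made up for the example, and it assumes a runtime version where SkeletonAnimation lives in the Spine.Unity namespace and exposes its skeleton and state fields; adjust for the version you actually have.

```csharp
using UnityEngine;
using Spine.Unity; // Assumption: older spine-unity versions have no namespace; drop this line there.

// Hypothetical example component. Attach it to the same GameObject as a SkeletonAnimation.
public class AnimationLookupExample : MonoBehaviour {

    // Assumption: the skeleton data contains an animation with this name.
    public string animationName = "walk";

    void Start () {
        SkeletonAnimation skeletonAnimation = GetComponent<SkeletonAnimation>();
        if (skeletonAnimation == null || skeletonAnimation.skeleton == null) {
            Debug.LogWarning("No initialized SkeletonAnimation on this GameObject.");
            return;
        }

        // SkeletonData holds the Spine.Animation objects deserialized from the skeleton's JSON.
        Spine.SkeletonData skeletonData = skeletonAnimation.skeleton.Data;

        // FindAnimation returns null when no animation has the given name.
        Spine.Animation found = skeletonData.FindAnimation(animationName);
        if (found == null) {
            Debug.LogWarning("Animation not found: " + animationName);
            return;
        }

        Debug.Log(found.Name + " lasts " + found.Duration + " seconds.");

        // Play the animation on track 0, looping.
        skeletonAnimation.state.SetAnimation(0, found, true);
    }
}
```

Caching the Spine.Animation reference once at load time, as above, avoids repeating the name lookup every time the animation is played.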