First, you can't do this in a general way.
In my talk at NFT.nyc, I deployed a smart contract on Ethereum Mainnet which created 2^255 NFTs using the ERC-721 standard. (Thank you to NFT.nyc and Chain 76, who sponsored this work. Link at http://nft.life.)
If you were planning to build a general database holding rows like (contract address, token ID, owner), then you would need more atoms than the universe has to store it.
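To get a feel for the scale, here is a rough back-of-the-envelope calculation. It assumes a bare 72-byte row (20-byte contract address, 32-byte token ID, 20-byte owner) with no indexes or storage overhead; the row layout is an illustrative assumption, not how any real indexer stores data.

```python
# Rough storage estimate for indexing a single contract with 2^255 tokens.
# Assumed row layout (hypothetical): address (20) + uint256 token ID (32) + owner (20).
TOKENS = 2**255
ROW_BYTES = 20 + 32 + 20

total_bytes = TOKENS * ROW_BYTES
YOTTABYTE = 10**24  # 1 YB = 10^24 bytes

print(f"rows:        {TOKENS:.3e}")
print(f"total bytes: {total_bytes:.3e}")
print(f"yottabytes:  {total_bytes // YOTTABYTE:.3e}")
```

That comes to roughly 4 × 10^78 bytes for one contract, which is why a naive "store every row" design cannot work.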
If you want to avoid "spam" projects like this as well as "fake" projects, and even "old versions" of projects (e.g. see how Su Squares re-deployed the project several times to new contract addresses)... then you are excluding a lot. Everybody I know who is working on that problem has decided to start with an allowlist and only check those tokens.
But assume you really wanted a general solution, and you were willing to blocklist some NFTs which are impractical to index. And assume you only cared about ERC-721 (you can repeat a similar approach for other standards). You would do this by one of these two approaches:
Play every block on the blockchain starting with when ERC-721 was finalized (June 22, 2018) and listen for the Transfer(address, address, uint256) event. Then filter that list using #1 above.
0xcert has created a tool which does exactly what you are describing and unfortunately it does require more than one beefy server to run. It is at https://github.com/0xcert/ethereum-scanner and is not publicly available. They may be willing to license the tool out or sell it. (I advise 0xcert as one of my clients.)
Other commercial options like OpenSea, Etherscan, Azure Blockchain (announced EOL), and Mathematica have great APIs available. Every one of them has limitations, and none of them is open source.
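If you do want to try the replay approach yourself, the following is a minimal sketch of the log-handling side, not a complete indexer. It builds an eth_getLogs JSON-RPC payload and decodes the returned logs into an ownership map. The function names, the blocklist parameter, and the block range are illustrative assumptions; sending the payload to a node is left out. It relies on the fact that the final ERC-721 standard indexes all three Transfer arguments (four topics), whereas ERC-20 uses the same event signature with only three topics, so contracts that deviate from the final standard would be missed.

```python
# keccak256("Transfer(address,address,uint256)"); ERC-20 and ERC-721 share it.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def get_logs_request(from_block, to_block):
    """Build an eth_getLogs payload for Transfer events in a block range."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_getLogs",
        "params": [{
            "fromBlock": hex(from_block),
            "toBlock": hex(to_block),
            "topics": [TRANSFER_TOPIC],
        }],
    }


def decode_erc721_transfer(log):
    """Return a dict for an ERC-721 Transfer log, or None for anything else."""
    topics = log["topics"]
    # ERC-721 indexes from, to and tokenId, so it emits exactly 4 topics.
    if len(topics) != 4 or topics[0].lower() != TRANSFER_TOPIC:
        return None
    return {
        "contract": log["address"].lower(),
        "from": "0x" + topics[1][-40:].lower(),  # last 20 bytes of the topic
        "to": "0x" + topics[2][-40:].lower(),
        "token_id": int(topics[3], 16),
    }


def apply_transfers(owners, logs, blocklist=frozenset()):
    """Update {(contract, token_id): owner}, skipping blocklisted contracts."""
    for log in logs:
        transfer = decode_erc721_transfer(log)
        if transfer is None or transfer["contract"] in blocklist:
            continue
        owners[(transfer["contract"], transfer["token_id"])] = transfer["to"]
    return owners
```

Logs must be applied in chain order (block number, then log index) for the map to reflect the current owner. The blocklist is where you would drop contracts that are impractical to index.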
For metadata, you just query every token for the tokenURI and record that information. Normalizing that data and making it useful would be a full-time job. This is, again, why commercial providers like OpenSea select only a relatively small number of token contracts to support.
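As a sketch of what "query every token for the tokenURI" looks like at the JSON-RPC level, the helpers below build an eth_call payload for tokenURI(uint256) and decode the ABI-encoded string a node returns. The helper names are hypothetical, and actually sending the request (and handling contracts that revert or do not implement the optional metadata extension) is left out.

```python
# First 4 bytes of keccak256("tokenURI(uint256)").
TOKEN_URI_SELECTOR = "c87b56dd"


def token_uri_call(contract, token_id):
    """Build an eth_call payload that invokes tokenURI(token_id) on contract."""
    # Call data = 4-byte selector + token ID left-padded to 32 bytes.
    data = "0x" + TOKEN_URI_SELECTOR + format(token_id, "064x")
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_call",
        "params": [{"to": contract, "data": data}, "latest"],
    }


def decode_abi_string(result_hex):
    """Decode a single ABI-encoded `string` return value from eth_call."""
    if result_hex.startswith("0x"):
        result_hex = result_hex[2:]
    raw = bytes.fromhex(result_hex)
    # Layout: 32-byte offset, then at that offset a 32-byte length, then the bytes.
    offset = int.from_bytes(raw[0:32], "big")
    length = int.from_bytes(raw[offset:offset + 32], "big")
    start = offset + 32
    return raw[start:start + length].decode("utf-8", errors="replace")
```

The returned URI is only a pointer. Fetching it, coping with dead links, and reconciling each project's JSON shape is where the normalization work described above begins.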